Big data: New options for implementation
Few aspects of analytics command as much attention these days as big data. “Harnessing big data to make insight-driven decisions is a business imperative in today’s digitally enabled world,” says Vince Dell’Anno, managing director, information management, data supply chain, Accenture Analytics (accenture.com). “One of the biggest mistakes companies can make is not taking the necessary steps to immediately mobilize data for exploration, and hunting to find value and opportunity from it.”
For companies that do not have the infrastructure to handle big data, however, getting started can be tough. In recognition of this challenge and also in response to the well-recognized shortage of data scientists, companies are developing new options, including self-service portals for analysis of big data and software products that help process big data.
MediaMath is a digital marketing technology company that offers a platform called the TerminalOne Marketing Operating System, which presents marketers with digital advertising opportunities and automates the execution of those opportunities. “Each day, TerminalOne connects marketers to over 120 billion opportunities to communicate with consumers with targeted, data-driven ads,” says Gareth Ouellette, VP of data products at MediaMath. “Our technology optimizes consumer interactions in real time and produces a massive amount of ‘data exhaust’ that is extremely valuable to our clients for measuring the effectiveness of their ads and campaigns.”
In recent years, MediaMath has experienced increasing client demand for direct access to the data. To address that demand, the company created a scalable data platform built on Amazon Web Services (aws.amazon.com); the platform consumes terabytes of data and nearly 10 billion log records each day. With the scale and access challenges under control, MediaMath began testing Qubole (qubole.com), a self-service portal for big data analytics, to support the growing analytic needs of both clients and internal users. The test of Qubole’s analytics suite was successful, and MediaMath rolled out the offering, first internally and then directly to clients.
MediaMath uses Qubole’s software in two distinct ways. In the first, MediaMath’s technical staff use Qubole directly and make the results of their analyses easily accessible to business users within the company, as well as to clients. Ouellette explains, “The technical staff write SQL queries, which can be saved so business users can select from a set of pre-existing queries, working from dashboards and templates. Our product offering has standard reports, but we can also write custom reports. For this situation, we generate queries and produce data pipelines that populate the client’s dashboards.”
Meeting the challenge
In the second use case, a set of MediaMath’s clients are using Qubole directly to analyze the data they receive from MediaMath. In this case, MediaMath provides access to the data through Amazon Web Services, and the clients use Qubole to conduct whatever actions or analyses they want on the data. “Our clients told us that while they had the necessary SQL skills to analyze their data, their challenge was in maintaining and budgeting for the IT infrastructure to support their growing data needs,” Ouellette says. “By offering Qubole to our clients through MediaMath’s data platform, we eliminated our clients’ infrastructure challenge and allowed them to focus on optimizing their marketing campaigns and driving business outcomes.”
MediaMath reports that using Qubole has saved it several million dollars in infrastructure investment. The software has met the company’s performance expectations internally and has also allowed MediaMath to give its clients a resource for analyzing the data resulting from their ads.
Qubole was founded to provide self-service analysis of data in Hadoop for companies that do not have the infrastructure to handle big data or that prefer to assign that role to an outside company. Its co-founders developed the internal analytics infrastructure for Facebook (facebook.com). “We wanted people to be able to go to a portal and look at large data sets to get insights without having to build out their infrastructure,” says Ashish Thusoo, CEO of Qubole. “By creating this platform we have changed the dynamic, so anybody and everybody can run queries.”
As a SaaS product, Qubole operates from the cloud, and all the needed resources are automatically provisioned behind the self-service portal. Companies can put their information in the clouds offered by Amazon, Google or any other cloud provider. “Qubole also interfaces well with business intelligence and data visualization tools such as Tableau and Jaspersoft, which can present the results of the analyses performed by Qubole,” Thusoo adds.
Among the leading verticals for big data use cases are media, telecom and healthcare. “Information in these industries is a gold mine,” Thusoo says, “and now that companies are feeling more confident of security in the cloud, they are more comfortable doing their analytics in that environment.” Having kept tabs on the issue of security for the past decade, Thusoo believes that most companies are convinced that cloud security is as good as or better than what they could provide for themselves. For example, AWS offers security mechanisms such as virtual private clouds and encryption. It also provides compliance-related capabilities such as audit trails and products that support HIPAA compliance.
Drug discovery expedited
Network pharmacology is an emerging discipline that uses network analytics to identify proteins related to diseases. Polypharmacology, which considers chemicals in terms of their action against multiple targets, is used to identify molecules that can affect the proteins. Those approaches evolved partially in recognition of the fact that diseases involve complex interactions among many biological systems, so the model for a disease and its cure should not be restricted to a single pharmaceutical acting on a single protein or condition.
Screening large numbers of compounds and comparing them to complex disease models is an analytics-intensive process. E-Therapeutics is a drug discovery company that conducts such analyses, which are the first stage in bringing a drug to market. The company developed a proprietary platform for network pharmacology, but as the company grew, more employees were evaluating compounds, and more computational power was needed.