Big Data Revolution

Travel Industry Leveraging Big Data For Competitive Advantage

The advent of Big Data creates huge opportunities in travel-related companies’ efforts to gain consumer insights, improve process efficiencies and enhance the consumer experience. What is the deeper role of Big Data and the greater impact on pricing and revenue management? How can travel companies — suppliers and agencies — further leverage Big Data for competitive advantage?

Data, understood in the broad sense captured by the term “Big Data,” is a key corporate asset in just about every industry. Moreover, Big Data is critical to most, if not all, of today’s sizable business ventures.

In fact, the volume of data generated and collected today is staggering: in about two business days, the world now produces as much data as was created throughout history up to approximately 2003.

No matter how it is measured, Big Data is changing the face of business. Therefore, businesses must successfully take advantage of the enormous opportunities represented by Big Data.

The travel industry — including travel suppliers, online travel agencies (OTAs) and global distribution systems (GDSs) — can access an extensive amount of data that is captured during normal business interactions across the travel value chain.

These interactions cover a wide spectrum from marketing, lead generation and interactive selling to fulfillment and customer care.

Yet travel industry entities have historically captured, stored and leveraged these data elements for competitive advantage only to a limited extent.

It is, however, definitely possible with today’s technology to put the vast amounts of data to excellent use by providing unique insights into consumer preferences and behavior patterns to improve conversion rates and, concurrently, to significantly bolster revenues.

Today’s fast-moving business environment warrants serious consideration of Big Data’s possibilities.

In the digital world of the 21st century, most entities that play any role in the travel industry are experiencing enormous growth in the sheer volume of raw data that must be captured, easily running into terabytes and petabytes of information (a petabyte is 10^15 bytes).

At the same time, the amount of data collected is growing rapidly. Each day, for example, the Sabre® global distribution system generates 7 terabytes of transaction data. And since air-shopping volumes have outpaced actual bookings during the past decade, the data collected has expanded at an ever-increasing rate.

Leveraging Big Data

Airlines have access to a vast amount of data that is obtained during regular business interactions across the travel value chain. They can leverage this data to learn customer preferences and behavior patterns and gain a competitive advantage.

Big Data Components

Big Data is characterized by volume, velocity and variety. It exists in two basic categories: structured data and unstructured data.

Examples of structured data include booking and ticketing transactions, post-departure data and other factual information. Examples of unstructured data include user-generated content from hotel reviews, posts on social-media sites, sensor data, audio, video, clickstreams and log files.

Insights into consumer behavior, process efficiencies and website design can be enhanced when these different types of data are analyzed together.

But the concept of Big Data — the entire breadth of the realm of data — involves much more than simply dealing with the exponential growth in data volume. It also encapsulates the tools that can be applied to process the reams of data efficiently, gain critical business insights, and help enable a corporation to be much more agile in its various actions and strategies.

Data from clickstreams, travel reviews and social-media posts are highly unstructured and impractical to store and process in a relational database management system such as Oracle, DB2 or Teradata.

The Twitter application program interface (API), for example, can capture terabytes of travel-related tweets each day to use for analysis of customer sentiment, detection of key trends and lead generation. This information can serve as impulse signals for demand forecasting.

When a company decides to capture data from operational systems to store in a data warehouse for analysts, it must first determine the proper approach for ensuring the framework allows storage of any type of data in a low-cost, scalable environment. The objective is to effectively reduce the cost of processing massive data volumes.

For its signature Web search purposes, Google developed a proprietary distributed file system called Google File System (GFS) and a parallel programming technique and framework called MapReduce.

These Google developments played a huge role in inspiring Doug Cutting to create the Java-based Hadoop, which builds on the same fundamentals. The unusual name Hadoop is borrowed from the toy elephant that belonged to Cutting’s son.

Hadoop’s core components are MapReduce and the Hadoop Distributed File System (HDFS).

Hadoop, in combination with a few open-source tools that complement it, makes huge, diverse datasets readily accessible for quick analysis using clusters of inexpensive commodity hardware.

Today, Hadoop represents an open-source revolution. It is considered mission-critical and is widely used by U.S. government agencies, including the National Security Agency, and by Web giants including Facebook, Twitter and Yahoo.

In recent years, Hadoop has evolved into something of an “ecosystem,” with many sub-projects such as Pig, Hive and HBase.

Among the newer data streams now collected is shopping data, basically consisting of the consumer’s request to fly to a destination, the shopping responses resulting from the request and the linkage of the shopping sessions to the actual booking that eventually results.

Historically, this information was stored in a data warehouse such as a Teradata database management system.

Such systems excel in storing structured tables and handling a mixed workload from hundreds of simultaneous users. But note, also, the downside: Significant time is required to model the data before storing, and the cost of a machine that can store very large datasets is significant.

Calibrating models is most effective when data points are stored in a raw, file-based format rather than normalized into Teradata tables. In a Big Data environment, attributes can be included or excluded directly during the calibration process.

With HDFS, shopping logs can be economically stored in their native format. And MapReduce enables the calibration and development of advanced analytical models.
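To make the MapReduce pattern concrete, here is a minimal, single-machine sketch in plain Python that counts shopping requests per origin-destination market from raw log lines. The pipe-delimited log layout, the sample records and the function names are assumptions made for illustration. A real Hadoop job would distribute the same map and reduce phases across a cluster.

```python
from collections import defaultdict

# Hypothetical raw shopping-log lines: timestamp|origin|destination|lowest_fare
RAW_LOGS = [
    "2013-05-01T10:00|DFW|LHR|850.00",
    "2013-05-01T10:01|DFW|LHR|845.00",
    "2013-05-01T10:02|JFK|LAX|320.00",
    "malformed line",
    "2013-05-01T10:03|DFW|LHR|860.00",
]

def map_phase(line):
    """Emit ((origin, destination), 1) for each valid shopping request."""
    fields = line.split("|")
    if len(fields) != 4:
        return []  # skip records that do not match the assumed layout
    _, origin, destination, _ = fields
    return [((origin, destination), 1)]

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)

def run_job(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

market_counts = run_job(RAW_LOGS)
```

Because the raw lines are parsed only inside the map step, adding or dropping an attribute means changing one function rather than remodeling warehouse tables.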

The output of these models, which is significantly smaller than the raw data, can then be loaded into a company’s operational and warehousing systems for use by a broader audience.

Relating specifically to the travel industry, then, the primary question is: What types of business challenges can a Big Data platform help a company solve?

Demand Forecasting Based On Consumer Preferences

Understanding consumer preferences requires access to shopping data, as does calibration of a consumer-choice model for forecasting demand.

This very sophisticated method for forecasting demand models the consumer’s selection process based on schedule and fare attributes.
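One common form of consumer-choice model is the multinomial logit, and the sketch below uses it to split a market-level demand estimate across competing itineraries. The article does not name the model actually used, so the logit form, the coefficient values and the sample itineraries are all assumptions for illustration. In practice the coefficients would come from calibration on shopping data.

```python
import math

# Hypothetical coefficients (assumed values, not calibrated results).
BETA_FARE = -0.004      # utility per dollar of fare
BETA_STOPS = -0.9       # utility per connection
BETA_ELAPSED = -0.003   # utility per minute of elapsed travel time

def utility(itinerary):
    """Linear utility built from fare and schedule attributes."""
    return (BETA_FARE * itinerary["fare"]
            + BETA_STOPS * itinerary["stops"]
            + BETA_ELAPSED * itinerary["elapsed_min"])

def choice_probabilities(itineraries):
    """Multinomial logit: P(i) = exp(V_i) / sum over j of exp(V_j)."""
    utilities = [utility(it) for it in itineraries]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - max_u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def forecast_demand(market_demand, itineraries):
    """Allocate total market demand to itineraries by choice probability."""
    return [market_demand * p for p in choice_probabilities(itineraries)]

options = [
    {"id": "nonstop", "fare": 450.0, "stops": 0, "elapsed_min": 180},
    {"id": "one_stop", "fare": 380.0, "stops": 1, "elapsed_min": 300},
    {"id": "two_stop", "fare": 350.0, "stops": 2, "elapsed_min": 420},
]
demand = forecast_demand(200.0, options)
```

With these assumed coefficients the nonstop option receives the largest share even though it has the highest fare, because the schedule penalties outweigh the fare difference.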

Customer Opinions

Twitter’s application program interface captures terabytes of travel-related tweets on a daily basis. This data can be used by airlines to examine customer opinions, discover key trends and generate leads.

Optimizing Air-Screen Display

Measuring screen quality enables OTAs and traditional travel agencies to determine their effectiveness in converting shoppers into bookers.

The air-itinerary selection process can be quite daunting. In response to a typical shopping request on an OTA, shopping algorithms generate at least 1,000 outbound flight schedules, as well as around 1,000 inbound flight schedules.

Multiply 1,000 times 1,000, and the staggering result is 1 million. So there are approximately a million itinerary options, and the optimal set of itineraries that provides the best consumer alternatives must be selected.

Because online consumers do not automatically select the lowest-priced itinerary, the itineraries displayed for each shopping request should provide diversity as expressed by quality of service (non-stop, single-connect, double-connect and interline flights), fares and carriers on both the outbound and inbound schedules. These are all critical factors in the selection process.
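As a rough sketch of how diversity might be enforced, the example below pairs every outbound schedule with every inbound schedule and then keeps the cheapest itinerary for each combination of carriers and connection counts. This bucketing rule is an assumption chosen for simplicity, not the method used by any particular shopping engine, and the sample schedules and fares are made up.

```python
from itertools import product

def build_itineraries(outbounds, inbounds):
    """Pair every outbound with every inbound (the full cross product)."""
    return [
        {
            "fare": o["fare"] + i["fare"],
            "carriers": (o["carrier"], i["carrier"]),
            "stops": (o["stops"], i["stops"]),
        }
        for o, i in product(outbounds, inbounds)
    ]

def diverse_subset(itineraries):
    """Keep the cheapest itinerary per (carriers, stops) combination."""
    best = {}
    for it in itineraries:
        key = (it["carriers"], it["stops"])
        if key not in best or it["fare"] < best[key]["fare"]:
            best[key] = it
    return sorted(best.values(), key=lambda it: it["fare"])

# Made-up sample schedules for illustration.
outbounds = [
    {"carrier": "AA", "stops": 0, "fare": 250.0},
    {"carrier": "AA", "stops": 0, "fare": 230.0},
    {"carrier": "BA", "stops": 1, "fare": 190.0},
]
inbounds = [
    {"carrier": "AA", "stops": 0, "fare": 240.0},
    {"carrier": "BA", "stops": 1, "fare": 180.0},
]
all_options = build_itineraries(outbounds, inbounds)
display = diverse_subset(all_options)
```

Here three outbound and two inbound schedules yield six combinations, which the rule reduces to four displayed itineraries that differ in carrier or quality of service.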

“Screen quality” can be measured with a calibrated choice model that determines the probability that a displayed itinerary will be selected. Choice model inputs may include the selling fare and schedule attributes.

Measuring screen quality is hugely valuable for continually improving shopping algorithms, so that screens avoid both poor options and large numbers of options that do not improve conversion rates.

To maximize conversion rates, itineraries can be ranked by an itinerary score that the choice model derives from schedule and fare attributes.
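The ranking step and one possible screen-quality measure can be sketched as follows. The score is a choice-model utility with assumed coefficients, and screen quality is defined here as the share of total selection probability captured by the displayed itineraries. That definition is an assumption made for illustration, and other summaries of a screen are equally plausible.

```python
import math

# Hypothetical calibrated coefficients (assumed values for illustration).
BETA_FARE = -0.005
BETA_STOPS = -0.8

def score(itinerary):
    """Itinerary score: utility from fare and schedule attributes."""
    return BETA_FARE * itinerary["fare"] + BETA_STOPS * itinerary["stops"]

def rank(itineraries):
    """Order itineraries from highest to lowest score."""
    return sorted(itineraries, key=score, reverse=True)

def screen_quality(displayed, candidates):
    """Share of logit selection probability captured by the display.

    Computed as sum(exp(score)) over displayed itineraries divided by
    sum(exp(score)) over all candidates.
    """
    shown = sum(math.exp(score(it)) for it in displayed)
    total = sum(math.exp(score(it)) for it in candidates)
    return shown / total

candidates = [
    {"id": "A", "fare": 400.0, "stops": 0},
    {"id": "B", "fare": 320.0, "stops": 1},
    {"id": "C", "fare": 300.0, "stops": 2},
    {"id": "D", "fare": 520.0, "stops": 0},
]
ranked = rank(candidates)
top_two = ranked[:2]
quality = screen_quality(top_two, candidates)
```

Comparing this value across alternative display algorithms gives a way to see whether adding more options to a screen actually captures more of the probability of a sale.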

Dynamic Intervention

Big Data can support dynamic personalization by configuring pages based on recent-past behavior as revealed from close analysis of clickstream data.

The utilization of dynamic intervention relates largely to hotel and resort traffic — as opposed to flight choices. Among the most valuable and interesting analysis models are those that look at past purchase behaviors of consumers and on that basis launch dynamic on-the-spot promotions.

The goal is to significantly improve conversion rates.

Fare Forecasting

Predicting when air fares will go up or down is an inherently difficult puzzle — particularly since several factors impact the outcome, including inventory-control recommendations from revenue management, response-to-competitor fare changes and promotional fares.

Fulfilled ticket data is not a good resource for fare forecasting, since a consumer’s purchase decision may or may not have been made based on the lowest price.

Air-shopping data is an ideal source for developing machine-learning algorithms (essentially an artificial-intelligence technique) that make “wait” and “buy” recommendations.

Quite logically, a “buy” recommendation would be made when fares are expected to go up, and a “wait” recommendation would be made when fares are expected to go down.

Reinforcement-learning techniques, which are neither supervised nor unsupervised, are very effective in prediction. This approach utilizes an “agent” to classify the recommendation as right or wrong, and the model “learns” and improves the accuracy of the prediction over time.
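The feedback loop described above can be illustrated with a deliberately simplified, bandit-style learner. It keeps a value estimate for “buy” and “wait” in each state, recommends the action with the higher estimate, and adjusts that estimate after learning whether the recommendation was right or wrong. The state labels, reward values, learning rate and observation history are all assumptions for illustration, and a production fare-forecasting model would be considerably richer.

```python
ACTIONS = ("buy", "wait")

class FareAgent:
    def __init__(self, learning_rate=0.5, initial_value=1.0):
        self.learning_rate = learning_rate
        # An optimistic starting value encourages trying both actions.
        self.initial_value = initial_value
        self.values = {}  # state -> {action: estimated reward}

    def _state_values(self, state):
        return self.values.setdefault(
            state, {a: self.initial_value for a in ACTIONS})

    def recommend(self, state):
        """Pick the action with the highest estimate (ties go to 'buy')."""
        values = self._state_values(state)
        return max(ACTIONS, key=lambda a: values[a])

    def learn(self, state, action, fare_went_up):
        """Reward +1 if the recommendation was right, -1 if wrong."""
        right = (action == "buy") == fare_went_up
        reward = 1.0 if right else -1.0
        values = self._state_values(state)
        values[action] += self.learning_rate * (reward - values[action])
        return right

# Hypothetical observations: (state, did the fare go up afterwards?)
history = [
    ("far_out", False), ("close_in", True), ("far_out", False),
    ("far_out", True), ("close_in", True), ("far_out", False),
    ("far_out", False), ("close_in", True),
]

agent = FareAgent()
for state, fare_went_up in history:
    action = agent.recommend(state)
    agent.learn(state, action, fare_went_up)
```

After this short history the agent recommends “wait” for the far-out state and “buy” for the close-in state, despite one observation in which a far-out fare rose.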

Customer-Centric Offer Management

Organic shopping data supplemented with proactive shopping data can be leveraged by airlines to attract customers and generate incremental bookings with customer-centric one-to-one offers. By capturing a customer’s intent to travel as an extension to the customer profile in a Big Data environment, it is possible to make individually targeted offers based on that intent. The intent to travel can be based on predefined criteria such as departure date (or range), departure time window, return time window, preferred carrier(s) and budget. In addition, if a customer opts in and a form of payment is captured, the best itinerary that fits the predefined criteria can be booked automatically instead of generating a real-time one-to-one offer.
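A minimal sketch of matching a stored intent to travel against available itineraries might look like the following. The field names mirror the criteria listed above, but the data structures, the sample values and the choice of “cheapest matching itinerary” as the definition of best are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, time

@dataclass(frozen=True)
class TravelIntent:
    earliest_departure: date
    latest_departure: date
    departure_window: tuple   # (earliest, latest) departure time of day
    return_window: tuple      # (earliest, latest) return time of day
    preferred_carriers: frozenset
    budget: float

@dataclass(frozen=True)
class Itinerary:
    carrier: str
    departure_date: date
    departure_time: time
    return_time: time
    fare: float

def matches(intent, it):
    """True when the itinerary satisfies every predefined criterion."""
    return (
        intent.earliest_departure <= it.departure_date <= intent.latest_departure
        and intent.departure_window[0] <= it.departure_time <= intent.departure_window[1]
        and intent.return_window[0] <= it.return_time <= intent.return_window[1]
        and it.carrier in intent.preferred_carriers
        and it.fare <= intent.budget
    )

def best_offer(intent, itineraries):
    """Cheapest matching itinerary, or None when nothing fits."""
    candidates = [it for it in itineraries if matches(intent, it)]
    return min(candidates, key=lambda it: it.fare, default=None)

# Made-up sample data for illustration.
intent = TravelIntent(
    earliest_departure=date(2013, 6, 1),
    latest_departure=date(2013, 6, 3),
    departure_window=(time(6, 0), time(12, 0)),
    return_window=(time(14, 0), time(22, 0)),
    preferred_carriers=frozenset({"AA", "BA"}),
    budget=500.0,
)
available = [
    Itinerary("AA", date(2013, 6, 2), time(8, 0), time(18, 0), 480.0),
    Itinerary("BA", date(2013, 6, 1), time(9, 30), time(16, 0), 450.0),
    Itinerary("BA", date(2013, 6, 2), time(13, 0), time(16, 0), 300.0),
    Itinerary("UA", date(2013, 6, 2), time(8, 0), time(18, 0), 200.0),
    Itinerary("AA", date(2013, 6, 5), time(8, 0), time(18, 0), 400.0),
]
offer = best_offer(intent, available)
```

The two cheapest itineraries are rejected because one departs outside the time window and the other is on a carrier the customer did not list, so the offer is the lowest fare among those that meet every criterion.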

Tailored Promotional Offers

Airlines can combine organic and proactive shopping data to interest customers and produce incremental bookings with customer-centric one-to-one promotional offers.

The “Chatter” Index

It is difficult, if not impossible, for airlines to be knowledgeable about all the significant events — either annual or one-time events — that occur within a 25-to-50-mile radius of various destinations.

An understanding of the relative popularity of these events is critical in effectively controlling seat inventory. In the absence of this information, suppliers may inadvertently run promotions during periods of high demand, unnecessarily risking revenue dilution.

“Chatter” about events such as marathons, wine tastings and food festivals can be picked up from Twitter, blogs and boutique websites that target segments of the population, then categorized to create a chatter index based on keywords. When the chatter index exceeds a predefined threshold, it can be used to alert travel suppliers early in the booking cycle to protect seats for late-booking, higher-valued passengers.
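A keyword-based chatter index with a threshold alert could be sketched as follows. The keyword list, the threshold value, the sample posts and the simple mention count are assumptions for illustration. A real index would likely weight sources and normalize for the usual volume of chatter at each destination.

```python
import re
from collections import Counter

# Hypothetical keyword list and alert threshold.
EVENT_KEYWORDS = ("marathon", "wine tasting", "food festival")
ALERT_THRESHOLD = 3

def chatter_index(posts):
    """Count keyword mentions per destination.

    posts is an iterable of (destination, text) pairs.
    """
    index = Counter()
    for destination, text in posts:
        normalized = re.sub(r"\s+", " ", text.lower())
        index[destination] += sum(normalized.count(k) for k in EVENT_KEYWORDS)
    return index

def alerts(index, threshold=ALERT_THRESHOLD):
    """Destinations whose chatter index exceeds the threshold."""
    return sorted(d for d, count in index.items() if count > threshold)

# Made-up sample posts for illustration.
posts = [
    ("BOS", "Training for the Boston Marathon!"),
    ("BOS", "Marathon weekend hotels are filling up"),
    ("BOS", "Anyone flying in for the marathon?"),
    ("BOS", "Marathon expo then a food festival downtown"),
    ("AUS", "Great food festival this weekend"),
    ("DEN", "Snow in the forecast"),
]
index = chatter_index(posts)
flagged = alerts(index)
```

In this sample only one destination crosses the threshold, which is the signal a supplier could use to hold back promotional inventory there.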

Other Big Data Applications

Big Data is also highly applicable in travel for fraud and identity theft prevention by detecting, in real time, abnormal behavior patterns.
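As one very simple illustration of flagging abnormal behavior, the sketch below compares a new transaction amount with a customer's own history using a z-score. The threshold, the sample amounts and the choice of a z-score test are assumptions made for illustration. Real fraud detection typically combines many signals beyond purchase amount.

```python
from statistics import mean, stdev

def is_abnormal(history, new_value, z_threshold=3.0):
    """Flag a value that lies far from the mean of past values."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return new_value != mu
    return abs(new_value - mu) / sigma > z_threshold

# Hypothetical past booking amounts for one customer, in dollars.
past_bookings = [120.0, 135.0, 110.0, 150.0, 125.0]

typical = is_abnormal(past_bookings, 140.0)
suspicious = is_abnormal(past_bookings, 2400.0)
```

A check of this kind is cheap enough to run on each incoming transaction, which is what makes real-time screening feasible.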

In addition, real-time collection of flight delays at airports, user-generated content and sensor data about changing weather patterns can serve as critically important early warning signals that enable airline operations to minimize flight delays and cancellations.

Proactively speeding up a delayed aircraft and delaying the departure of subsequent flight legs to preserve aircraft, crew and passenger connections can effectively minimize disruptions.

The Age Of Big Data

Big Data is the most innovative vehicle to date for providing insights into consumer-behavior patterns and improving process efficiencies that were previously quite difficult or impossible to achieve.

Forward-thinking travel suppliers, OTAs and GDSs realize that Big Data is not simply a temporary phenomenon, but a truly essential tool for gaining a competitive edge in a digital world in which numerous additional sources of new and highly valuable data are discovered daily.

Big Data will undoubtedly continue to expand exponentially for the foreseeable future.

Sabre Holdings® generates a large volume of travel data daily. It processes more than 1 billion incoming requests daily and supports peak volumes of 80,000 transactions per second.

To complement the existing Enterprise Travel Data Warehouse (Teradata) and other Data Marts (Oracle), Sabre Holdings is investing in a Big Data platform (Hadoop HDFS) by deploying the Oracle Big Data Appliance (BDA) to gain insight into consumer behavior patterns that was previously not possible. The deployment includes the Oracle BDA that is expandable to eight frames with onboard switches. Each frame has:

  • 18 Oracle Sun servers
  • Oracle Enterprise Linux 5.6
  • 40 Gb/second InfiniBand
  • 10 Gb/second Ethernet
  • 467 Terabytes raw storage
  • 864 GB memory

This investment in Big Data is important for airlines since it adds a new dimension to decision making that generates incremental revenues, drives operational efficiency throughout their business and delivers relevant content to their customers, enhancing customer retention and brand loyalty.
