Visualizing Time Series Data with Java
Session date: Jun 25, 2019 11:00am (Pacific Time)
Description: In this presentation, Daniella Pontes will talk about an application monitoring solution built on top of InfluxDB to monitor user events, selected application events, and error notifications. The application is used by Web Shop Fly, a company that has grown into a big flight-ticket-selling travel agency operating in more than 5 European countries and Russia. Although the IT services are also covered in parallel by Google Analytics, Hotjar, and AWS CloudWatch, InfluxDB puts all the important metrics under one umbrella and provides an exact and transparent way to find the metrics useful for UX improvements, fast response to anomalies, order management, and monitoring the quality of flight data.
[et_pb_toggle _builder_version=”3.17.6” title=”Transcript” title_font_size=”26” border_width_all=”0px” border_width_bottom=”1px” module_class=”transcript-toggle” closed_toggle_background_color=”rgba(255,255,255,0)”]
Here is an unedited transcript of the webinar “Visualizing Time Series Data with Java”. This is provided for those who prefer reading to watching the webinar. Please note that the transcript is raw. We apologize for any transcription errors.
Speakers:
- Daniella Pontes, Senior Product Marketing Manager, InfluxData
- Mirek Malecha, Director of Product Management, Bonitoo
Daniella Pontes: 00:00:01.147 Welcome, everyone, to our webinar. This is Daniella Pontes, I’m part of product marketing at InfluxData, and I have Mirek with me from Bonitoo to share with you an incredible journey.
Daniella Pontes: 00:00:14.753 InfluxData is the company behind InfluxDB, and we provide an open source time series database. But I’d like to say that what we really do is provide tools for improvement, and we do this by giving you real-time visibility into stacks, sensors, and systems. But when you go about improvement, you can have two main outcomes. One, I would say, is when you reach excellence in terms of how you operate things. Another one is in terms of the services that you provide to the end user, making sure that it’s a good service. But there is a very important outcome of performance monitoring and improvement, which is about giving you that competitive edge that sets you apart from the crowd, standing out, and making sure that your value is visible. So from my point of view, performance management is about goals, and I brought you these slides today to show you some of the companies that are our customers and use our platform. And you can see on the left side that some of them are quite large and very well known in the market, but there are some different segments, and some of them are actually medium-sized. And here, at the bottom, you can see WEB SHOP FLY, which is a market entrant. And I put them all together to show you that they have one very important thing in common. All of them had a clear goal when they started their journey on performance monitoring and improvement.
Daniella Pontes: 00:02:10.949 WEB SHOP FLY, which is the one that we are going to talk about, their goal was to win customers by providing flight searching and performance experience. And when I heard that, I had that feeling, yes, of course, as a user, the experience today, I’m sorry, it’s not that simple. It requires a lot of interaction, and, again, you put all the data in the fields, and then I don’t trust what I get back. I feel the need to go and check the same information from different sources, and I still have that feeling that I didn’t get everything that I could get, and even when I find something that I think, “Oh, that seems okay,” when I click and I proceed to purchase the flight, I get a different price. [A little savings there?]. So yes, guys, I think there’s an opportunity there. I see the need, and I’m going to be your first customer. And then I started checking exactly what they are getting into. So I thought, “Oh, my goodness. It’s a crowded market. Competition from everywhere.” There are so many different sites, so many ways that you can get your flight pricing, direct, indirect. The available frameworks don’t really give you any opportunities for differentiation. Everybody gets everything from the same kind of engine. And the situation that you see today also involves a lot of old data out there, giving the customer inaccurate prices, and there is that general feeling that it is what it is. Yeah. It’s as it should be.
Daniella Pontes: 00:04:06.182 So as a consequence, on top of everything, you have customers that are easily disengaged because they don’t feel any commitment from the other side in terms of providing something that they can rely on, with the confidence that it’s the best they could get. So my reaction was double-sided. The first was, yes, definitely, this is an opportunity, and I think that there is a lot to be done there. And then when I took a look at what they were getting into, I said, “Really?” But from Bonitoo’s perspective, which is the company behind this project, their reaction was different. They said, “Yes. There is an opportunity here. Look how messy this environment is, and let’s tap it.” And then when they looked into the challenges that they faced, they said, “Yes. There is a lot of room,” and they had that sense of empowerment to say yes, it’s a challenge, but that’s where the big opportunities are. So they decided, “That’s what we are going to do because opportunity is in the eye of the beholder.” So the first thing that they started to do, then, was establish their targets, the areas they thought would benefit most from improvement. First, the response time to search was, in certain cases, over a minute, and they said, “No, it’s going to be max 12 seconds,” and that is a factor-of-five improvement. That is huge and it makes a difference. I’m a witness to that. Price accuracy, they said, “We will only tolerate 5% inaccuracy.”
Daniella Pontes: 00:05:53.502 Another thing that they thought was they didn’t want just to provide data to the end user, they wanted to provide information, so they took to themselves the work to organize, to take into consideration what would be meaningful and what would be less meaningful and present that information in an actionable way for the user. Another thing is that the UI was quite passive, and I would say passive-aggressive because it’s like you need to always be providing information, and every time some results come to you and it’s not what you wanted or, again, you need to do something else, and that sense of every second that you miss is someone that is taking the best price available. So it was very passive, and they said, “Why not think from a menu perspective, and whatever information we have from the start, we already use that to present the user with shortcuts?” So from the starting point, they would already present options for destinations, keeping in mind that the premise is this, customers want to go to a nice place, have a nice vacation, why not provide some recommendation? That would help those that are open to suggestions into a good deal, and those that know exactly what they want would not be bothered.
Daniella Pontes: 00:07:19.581 Another area that they saw for improvement was the integration between the customer experience online and the back-end support team. So whenever they would detect some kind of degradation, the support team could just jump in and hold the hand of the customer and make sure that they feel that they’re taken good care of. So once they had those targets very well defined, the next important step that one must take is to make sure that you get yourself the right tools. Why? You may have a good plan, but if you don’t have the tools that can take you all the way, you will just waste your time and will have to start over again. So the first thing they decided was, “Forget about framework. We’re going direct API. We want to get the information that we need, so we are not going to be there and just collect information that they make available,” so they did their own Java client instrumentation, and they wanted to make sure that they have the platform in place to collect the timestamped data at the scale that they need. And also, if they want to make sure that back-end and front-end are integrated, they certainly need real-time, so they chose InfluxDB.
Daniella Pontes: 00:08:53.612 Well, so before I steal more time there from Mirek, because I love this project - I think what they did was incredible, and they started from a place where, I think, a lot of you could be now. You are in similar situations where there is a lot of room for improvement, but the complexity of the environment, the amount of moving parts, the integration part, and that fear that touching may break even more keeps a lot of us just paralyzed, but taking the first step is already a good step. So now I let Mirek take you on this journey, the WEB SHOP FLY way to win. Mirek - with you.
Mirek Malecha: 00:09:36.749 Thank you, Daniella. So let me start, first of all, with a short agenda to show what I’m going to present. First of all, I will introduce the WEB SHOP FLY solution, then I will continue with some short demos, so you will see the real application, the responsive UI, instead of the passive ticketing systems that you are typically using. Then I will focus on the challenges because it looks very simple from a UI point of view, but there are a lot of issues that we were facing and that we have solved during the implementation. And then, we will look at the monitoring because the monitoring part is very, very important. We have to always understand what is happening and how to improve it. And then once we show the dashboards, we will move back to the architecture, so how we have implemented it. I will also show a short code snippet that shows how we are integrating the application with InfluxDB. There is also some alerting architecture because we are also using the alerting capabilities, and then there is the Q&A.
Mirek Malecha: 00:11:01.786 So let me start with a brief introduction. WEB SHOP FLY, what does it do? It’s a solution. It’s a next-generation e-shop that is used for ticketing of air tickets. It has been developed by Bonitoo. That’s why there are two names. Bonitoo has developed this from the beginning, so we designed it, we designed the architecture and technology, we fully developed it, and we are still developing this solution, and we are operating this solution and supporting it. InfluxData, as already mentioned, is used mainly for monitoring, and we are using the monitoring for all three layers, beginning with infrastructure. We are also monitoring the internal functionality of the application, as well as delivering some business-level metrics to our stakeholders to understand better the value of this application. Well, it’s a little bit more complex because WEB SHOP FLY is, more or less, a platform that is delivering these capabilities through five different stakeholders. There are five different portals and we are just preparing some new ones, and these portals are active mainly in Europe, currently in six countries, and yeah, this is how it works.
Mirek Malecha: 00:12:40.407 So let me show a short demo of how the WEB SHOP FLY works, and then we will continue with the monitoring, so let me share my screen. So this is the main screen. As I already mentioned, there are some fields that are quite obvious in all air ticketing systems, but here, additionally, we have a menu of the options. So here, for example, I want to fly from London, and here I have a full palette of options where I can fly, including the prices. It can be sorted according to multiple criteria like popularity or, I don’t know, distance or I would like to fly as far as possible, for example, or according to price, so the cheapest possible flights from London. I can also select some filters like I would like someplace where I can be close to the sea, someplace where it’s very suitable for parties, so here I have the offer.
Mirek Malecha: 00:13:45.300 So let’s say I would like to fly to Barcelona because this is the cheapest one from London, and it is aligned with all my requirements, and here I have the full list of flights. I can select a specific one as well as, in case I would like to just monitor the specific location, I can also build some watchdog where I can enter my email address and the system will actively notify me once the price is interesting, related to the specific location. Let’s say this is the flight that I would like to select, so I can click and select this flight. Currently it is connecting to the online system of the airline, and, as you see, the price is still US$46, as it was, but sometimes it may happen that if an airline, during the time of the ordering, changes the price, the price is changed here as well, which is one of the issues that I’m going to focus on later on. So this is, more or less, the demo of the WEB SHOP FLY. It’s a very, I would say, simple interface where you can very quickly select the right flight for where you want to fly.
Mirek Malecha: 00:15:13.937 So let me return back to the presentation. So what are the main challenges? As Daniella already mentioned, we wanted to build something that would be fast enough, because the feedback from our customers or potential customers was that the typical systems are pretty slow. So our SLA, the median should be seven seconds and less, of course. The second one is because there is a high competition on the market, deliver best prices, which means that we have to be very flexible from integration point of view, be able to connect the application with all the possible systems that are available. And the last one is deliver fresh prices. This is one of the issues that we have, and I will mention this later on, be able to always offer, in the menu, the prices which are currently available. The prices are changing all the time.
Mirek Malecha: 00:16:24.700 So let me briefly describe the ecosystem where the WEB SHOP FLY is integrated. So there are many, many integrations because the WEB SHOP FLY has to be integrated with airlines. We are also using content providers and aggregators which are aggregating the airlines into some more common interfaces. We are also collaborating with meta search engines, which are just searching the flights like, for example, Google Flights, yeah, so we can search there the flights, but the flight itself is ordered through the specific system which is linked with Google Flights. And, of course, there are various technologies because these APIs from these systems, especially in the case of airlines, they are using legacy systems, so traditional SOAP APIs, REST APIs. We are transferring files. We are connecting various protocols. It’s very challenging from this point of view.
Mirek Malecha: 00:17:32.548 The second issue is that the prices are changing over time and very quickly, yeah, in minutes. So we were capturing around 500 million flights that were changing every 10 minutes, yeah, so it’s very challenging to show some offer of the flights if the prices are changing. So we have to somehow balance the accuracy of the prices, the search time because we can’t search all the time through all live APIs, and of course, the cost, because the API calls to our partners cost money. So this is something where we are using multiple solutions, and the monitoring is the key one. So the first one is the business monitoring because we have to select the flights which are attractive. We don’t need to capture all 500 million flights with all the dates and possible time frames. The second one is that we implemented very smart integrations with our partners. It’s not just about the technology, but also how to [inaudible] effectively with caches, which is connected with the application performance monitoring part that is measuring the quality of the caching algorithm and helping us to improve it all the time.
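The following sketch was not shown in the webinar. It illustrates one simple way to trade price freshness against live API cost, as described above: a cache that serves a stored price until it is older than a time-to-live and counts hits and misses so the cache quality can be monitored. The class name `PriceCache`, the 10-minute TTL, the `liveLookup` function, and the flight key format are all assumptions made for illustration; the talk does not describe Bonitoo's actual caching algorithm.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.ToDoubleFunction;

/** Hypothetical price cache: serves cached prices until they are older than a TTL. */
final class PriceCache {
    private static final class Entry {
        final double price;
        final long fetchedAtMillis;
        Entry(double price, long fetchedAtMillis) {
            this.price = price;
            this.fetchedAtMillis = fetchedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;
    private final ToDoubleFunction<String> liveLookup; // stands in for a paid partner API call
    private final LongSupplier clock;                  // injectable so the cache can be tested
    private long hits;
    private long misses;

    PriceCache(long ttlMillis, ToDoubleFunction<String> liveLookup, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.liveLookup = liveLookup;
        this.clock = clock;
    }

    /** Returns the cached price if it is still fresh, otherwise asks the live source. */
    double priceFor(String flightKey) {
        long now = clock.getAsLong();
        Entry cached = entries.get(flightKey);
        if (cached != null && now - cached.fetchedAtMillis < ttlMillis) {
            hits++;
            return cached.price;
        }
        misses++;
        double fresh = liveLookup.applyAsDouble(flightKey);
        entries.put(flightKey, new Entry(fresh, now));
        return fresh;
    }

    /** Share of lookups answered from the cache; a candidate metric for a dashboard. */
    double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        // The lambda returns a fixed price instead of calling a real partner API.
        PriceCache cache = new PriceCache(10 * 60 * 1000L, key -> 46.0, System::currentTimeMillis);
        cache.priceFor("LON-BCN");
        cache.priceFor("LON-BCN");
        System.out.println("hit ratio: " + cache.hitRatio());
    }
}
```

A longer TTL lowers API cost and search time but raises the chance of showing a stale price, which is the balance the speaker describes; the hit ratio is one number that could be written to the monitoring system to track it.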
Mirek Malecha: 00:19:06.475 So let’s start with monitoring. This is the example of a list of dashboards we are currently using. We can separate them into three groups. The first group beginning with A is linked with application monitoring. The dashboards with B are the dashboards which are visible for our stakeholders and business owners, and we have also infrastructure monitoring which is focusing on the machines, databases and so on. I have prepared a demo that will focus on some of them because we have limited time. I think I can’t cover all of them, so let me jump to the dashboard itself.
Mirek Malecha: 00:20:09.628 Okay. So this is the dashboard. And let me start with the business monitoring. So I will open the pricing events, which means here I can see the amount of requests that are asking for price. If you remember my demo, once I clicked into the specific flight, the system connected online the specific airline in order to validate the price. So here we can see the amount of requests, and this is the green part. The orange part is showing amount of errors, because even when we ask the airline, very often, we are getting some error codes, so we have to handle this, and here we are measuring the quantity of these issues. Of course, if I selected or select specific time, I dynamically see the flights below the chart, as well as the list of the issues. I anonymized, a little bit, the view, in order to simplify the demo, but there are some more detailed information. The second chart is showing the response time of the search. So here you can see the maximum time, which is the blue one. We have the mean time, which is our SLA, which is purple, and there is, also, minimum time, which is, yeah, very small, typically, under one second. So these are the metrics that we are capturing, which are linked with the searching capability.
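The following sketch was not shown in the webinar. It illustrates the aggregates behind the response-time chart described above (maximum, mean, and minimum), plus the median, since the talk describes the SLA once as a median of seven seconds and once as the mean. The class and method names are hypothetical, and the sample durations are made up. On the dashboard itself these values would come from database queries; InfluxQL, for instance, offers `MIN()`, `MEAN()`, `MAX()`, and `MEDIAN()` functions.

```java
import java.util.Arrays;
import java.util.LongSummaryStatistics;

/** Hypothetical helper that summarizes search response times given in milliseconds. */
final class SearchTimeStats {

    /** Min, max, and mean in one pass using the JDK's summary statistics. */
    static LongSummaryStatistics summarize(long[] durationsMillis) {
        return Arrays.stream(durationsMillis).summaryStatistics();
    }

    /** Median of the samples; the input array is left unmodified. */
    static double median(long[] durationsMillis) {
        if (durationsMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = durationsMillis.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** True when the median response time is within the given limit. */
    static boolean meetsMedianSla(long[] durationsMillis, long limitMillis) {
        return median(durationsMillis) <= limitMillis;
    }

    public static void main(String[] args) {
        long[] samples = {800, 12_000, 3_000, 5_000}; // made-up durations
        LongSummaryStatistics stats = summarize(samples);
        System.out.println("min=" + stats.getMin()
                + " mean=" + stats.getAverage()
                + " max=" + stats.getMax()
                + " median=" + median(samples));
        System.out.println("median within 7 s: " + meetsMedianSla(samples, 7_000));
    }
}
```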
Mirek Malecha: 00:22:00.175 The next step is the order. So once I - sorry. Once I order the specific flight, here I can see some metrics. So here I can see the list of metrics, list of orders or amount of orders over the time. I can even zoom in into the specific time, so here I can see how many events do we have here. So there are, for example, two events, and as you see, here we have the list of orders. Here we have the failures that we detected, and we have here some additional information like what is the airline, what is the system that has been used for the ordering, as well as this is the flight from Prague to Madrid. So here we can see some details. Again, it’s very helpful to be able to capture the information and get the context about the orders, errors which are coming from completely different systems, so here we have a linking of all the events together.
Mirek Malecha: 00:23:11.730 So this is the business monitoring related to the orders. I have also the monitoring related to watchdogs. So here I can see the list of notifications. We have to wait. It takes some time. So here we have the list of the emails that we have sent to our customers that have already registered the watchdogs that can notify about the price change or the better price of the specific flight. We have, also, the list of registrations that shows how many registrations we did and what’s the duration of those specific registrations, so what is the response time. I can see, also, the amount of registration over the time. And here is a very interesting chart which is showing the adoption of the watchdog that is providing the information. So we started in January this year, and as you see, it’s growing constantly. So this is another metric which is showing, also, the adoption.
Mirek Malecha: 00:24:25.225 Now let me focus more on the business infrastructure monitoring. Here we have quite typical monitoring of the database. All the charts here except the first row are the standard charts. If you configure Telegraf, you will get them out-of-the-box. We extended this using some custom connectors that are providing information regarding the availability of the database. So there is just one or zero. One means the system is available. If this is zero, it means that the system is not available. So we extended this to be able to understand whether the system is working well or not. So this is the demo related to the monitoring. So let me continue with the architecture. This is the architecture diagram. On the left side you can see all the types of the services provided by WEB SHOP FLY. There are business services, there are application services, and some technology services, which are all linked together, and all these layers are connected with the InfluxData stack.
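The following sketch was not shown in the webinar. It illustrates the one-or-zero availability metric described above, under the assumption that availability is judged by whether a TCP connection can be opened. The class name, the measurement name `mongodb_availability`, the field name `up`, and the host are hypothetical; Bonitoo's actual connector is not described in the talk. The output is a single line of InfluxDB line protocol, which is one format that Telegraf's exec input plugin can read from a command's standard output.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.function.BooleanSupplier;

/** Hypothetical availability probe that reports 1 (available) or 0 (not available). */
final class AvailabilityProbe {

    /** True if a TCP connection to host:port can be opened within the timeout. */
    static boolean tcpReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /**
     * Renders the probe result as one line of line protocol.
     * Assumes service and host contain no spaces, commas, or equals signs,
     * so no escaping is done here. No timestamp is written, which leaves
     * it to the receiver to assign the current time.
     */
    static String toLine(String service, String host, BooleanSupplier probe) {
        int up = probe.getAsBoolean() ? 1 : 0;
        return service + "_availability,host=" + host + " up=" + up + "i";
    }

    public static void main(String[] args) {
        // 27017 is MongoDB's default port; the host is an assumption.
        System.out.println(toLine("mongodb", "localhost",
                () -> tcpReachable("localhost", 27017, 1000)));
    }
}
```

Passing the probe as a `BooleanSupplier` keeps the formatting testable without a running database.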
Mirek Malecha: 00:25:57.034 We are using Telegraf in order to capture data. We are using the out-of-the-box plugins that are available. We also extended this with our own plugin like the one that is producing the ping metric related to MongoDB. We are using the JavaScript client that is capturing the information from our application. We are using the Java client where we are using Java code. We are using, also, Kapacitor, and later on, I will show you some diagram related to the Kapacitor, and of course, you saw the Chronograf dashboard that I was using during the demo. The whole stack is integrated with GitLab for authentication of the users that connect to the system. This is an example of the JavaScript code, how we generate the events into InfluxDB. So we prepare the message, we add some additional information we want to have in the monitoring, and then we send the event into the analytics class that is responsible for all events. I also extracted some key commands from the analytics class here with the connection to InfluxDB, and then once the event is created and properly formatted, it’s written into the database. So this is how we capture events from various sides of the application and write them into InfluxDB where we can generate the metrics as you saw in the demo.
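The code shown on the speaker's slide is not part of this transcript. As a stand-in, the following sketch illustrates the step described above, preparing an event with some context and formatting it for the database, using only the JDK. It renders an event as InfluxDB line protocol, the text format that the InfluxDB write API accepts. The class name `EventLine`, the measurement `order`, and the tag and field names are hypothetical, and a client library would normally do this formatting and the HTTP write itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: formats one application event as InfluxDB line protocol. */
final class EventLine {

    /** Measurement names escape commas and spaces. */
    static String escapeMeasurement(String s) {
        return s.replace(",", "\\,").replace(" ", "\\ ");
    }

    /** Tag keys, tag values, and field keys escape commas, equals signs, and spaces. */
    static String escapeKey(String s) {
        return s.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ");
    }

    /** String field values are double-quoted; backslashes and quotes are escaped. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static String format(String measurement, Map<String, String> tags,
                         Map<String, Object> fields, long timestampNanos) {
        if (fields.isEmpty()) {
            throw new IllegalArgumentException("line protocol needs at least one field");
        }
        StringBuilder line = new StringBuilder(escapeMeasurement(measurement));
        for (Map.Entry<String, String> tag : tags.entrySet()) {
            line.append(',').append(escapeKey(tag.getKey()))
                .append('=').append(escapeKey(tag.getValue()));
        }
        StringJoiner fieldSet = new StringJoiner(",");
        for (Map.Entry<String, Object> field : fields.entrySet()) {
            Object value = field.getValue();
            String rendered;
            if (value instanceof Long || value instanceof Integer) {
                rendered = value + "i";              // integers carry an "i" suffix
            } else if (value instanceof Number || value instanceof Boolean) {
                rendered = value.toString();         // floats and booleans are written bare
            } else {
                rendered = quote(String.valueOf(value));
            }
            fieldSet.add(escapeKey(field.getKey()) + "=" + rendered);
        }
        return line.append(' ').append(fieldSet).append(' ').append(timestampNanos).toString();
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("airline", "XY");          // made-up airline code
        tags.put("route", "PRG-MAD");       // Prague to Madrid, as in the demo
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("duration_ms", 850L);
        fields.put("status", "ok");
        System.out.println(format("order", tags, fields, System.currentTimeMillis() * 1_000_000L));
    }
}
```

Tags hold the context used for filtering and grouping on dashboards (airline, route), while fields hold the measured values; that split is a design choice of this sketch, not something stated in the talk.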
Mirek Malecha: 00:27:52.792 We are also using Kapacitor. The Kapacitor is used mainly for alerting. This is the booking order process that is done through web interface. So the clients are creating a new order. So the event is pushed into InfluxDB, then it’s processed by Kapacitor service that creates the order - or use the order notification structure to push this into our back-office system. The back-office system contains, also, Slack, where we can get - or our back-office people will get some additional information about the new bookings, like you can see here, or some booking error in case something doesn’t work well. So we are using InfluxDB also as a source of the events for alerting into the back-office system.
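The following sketch was not shown in the webinar. In the architecture described above, Kapacitor processes the order events and pushes notifications to the back-office system; the plain-Java sketch below only illustrates the mapping step, turning an order event into a "new booking" or "booking error" message. The `OrderEvent` fields and the message wording are assumptions, and no Kapacitor or Slack API is used.

```java
/** Hypothetical mapping from an order event to a back-office notification text. */
final class BookingNotifier {

    /** Minimal event shape assumed for this sketch. */
    static final class OrderEvent {
        final String status;  // "ok" or "error" in this sketch
        final String route;
        final String airline;

        OrderEvent(String status, String route, String airline) {
            this.status = status;
            this.route = route;
            this.airline = airline;
        }
    }

    /** Failed orders become error notifications; everything else is a new booking. */
    static String toMessage(OrderEvent event) {
        boolean failed = "error".equals(event.status);
        String prefix = failed ? "Booking error" : "New booking";
        return prefix + ": " + event.route + " via " + event.airline;
    }

    public static void main(String[] args) {
        System.out.println(toMessage(new OrderEvent("ok", "PRG-MAD", "XY")));
        System.out.println(toMessage(new OrderEvent("error", "LON-BCN", "XY")));
    }
}
```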
Mirek Malecha: 00:29:02.864 Well, why are we selecting InfluxData? The first reason was very fast adoption. So we were able to create quickly the events, we were able to push them into InfluxDB, and we were able to very quickly show the analytical views from the InfluxDB to various stakeholders, beginning with infrastructure to the applications, so people that work for us that were responsible for the specific components were able to create the specific views they need to be able to support, maintain, and improve the application. The second reason is very reasonable hardware requirements. Even the current version of InfluxDB is running on AWS. There is the machine specification. It’s pretty effective in case of the compression of the data, so even when we have a huge amount of events and additional information, it’s still possible to manage it very effectively.
Mirek Malecha: 00:30:16.370 Another reason was we are actually capturing the information from different technologies, beginning with the libraries of the application. We are capturing the information about technology, various databases, some information about the hosting of the machines, and so on. And there are a lot of predefined connectors that we can leverage and we can quickly use to capture the information, as well as we are using some third-party services mainly related to the user adoption like Google Analytics, Hotjar, and so on, where we can also use the metrics from these systems and events to capture them and link them with other events that are created in the WEB SHOP FLY. So it’s suitable from a technological point of view as well as we can capture the various types of events beginning with the information about orders. We can measure the user adoption and combine all this information together. We can link this with really technical information like MongoDB, which is used as a primary storage for caching of the flights, as well as there can be very low-level information like network traffic and so on. So the full stack is very suitable and very easy to integrate and implement and get the right views to get the right insights and to make the right decisions based on the information that we are getting from the InfluxDB stack.
Mirek Malecha: 00:32:11.905 So this is the summary. Just to briefly introduce the Bonitoo company. Besides WEB SHOP FLY, we are delivering some other projects. We are a development company that is covering, end-to-end, the whole R&D. So we’re using InfluxDB for our other projects as well, for the main reasons we mentioned here in the previous slide. Also, based on other experience, we became an InfluxData consulting partner. We are also helping InfluxData to deliver and develop some libraries that we were missing, and currently, yeah, these libraries are available. We are based in Prague in the Czech Republic. We have, also, an office in Vietnam, and we have some other offices around the Czech Republic where we are developing the solutions. Okay. So that’s all from my side.
Daniella Pontes: 00:33:28.454 Thank you very much, Mirek. I would like just to wrap up this presentation. I would say that it’s an attitude change in terms of viewing monitoring and performance monitoring as something that can be completely prepackaged for you. I think at the end of the day, you are the one that knows what you need to do or where you need to improve and what kind of data would be useful for you. I’ve seen a lot of situations where you get into that point of data intoxication because there are so many sources of monitoring data just being collected and in-silo solutions where you need to do a lot of correlation between [inaudible] from one thing that you’re measuring, sometimes the same thing in another system. And to get this bigger view, you need to collect data from different systems and try to find a way to make sense out of it.
Daniella Pontes: 00:34:45.579 So I think from InfluxData, this project from Bonitoo is very iconic because with one platform they could manage all the data that they needed in order to achieve that competitive edge, metrics, events, business indicators, everything in one platform. And I believe that makes a difference in terms of how can you use data from different players, I would say, to extract more actionable information, not just data, but, actually, something that means something to you and will lead to a workflow that is more productive. Thanks, everyone.
[/et_pb_toggle]