May 30, 2019

Open the Gate for Healthy Apps with Cloud Performance Testing

Imagine spending hours working on an app only to have it perform poorly and crash when too many users are using it. Enter: Cloud Load Testing (CLT).

Стефан Шопов

Imagine spending hours and hours working on an application, putting enormous effort into its development and polishing every tiny detail. Functional tests are going well and your app seems to be doing a great job. When launch day comes around, though, that pat on the back you’ve been longing for is nowhere in sight. While features and performance levels fare well in pre-production, the application simply crashes when real-life users come along. It turns out the system can barely withstand the load of a large number of users at different locations, which leads to sluggish page responses, errors, and timeouts. On top of that, you get another harsh slap: you actually lose users, driven away by the app’s poor performance. This is where true failure emerges.

Cloud Load Testing to the Rescue

Such unpleasant surprises are by no means exceptions in the development lifecycle, since in-house tests alone are ill-equipped to provide realistic insights into how a program will react to an increased user load in practice. Lab tests can only simulate parts of the production environment. Nonetheless, a cure for these woes does exist. “Sunshine due is just a cloud away”, as the soundtrack of a favorite animated movie goes, and so too is the sound performance of your application. In other words, cloud environments can test your system for load before it goes live. Cloud-based load testing (CLT) allows you to examine how an application performs under heavy load in real life.
Cloud environments can mimic the presence of a set number of end users at preselected locations around the globe, so you don’t need to build your own infrastructure to carry out a realistic assessment of the workload capabilities of the program you’ve developed.

The Benefits of Cloud Load Testing

Before we outline the advantages of cloud load testing, we should point out an important caveat. Although the cloud-based method can create a more authentic validation environment, you shouldn’t eliminate internal tests. Rather, the two approaches should go hand in hand so that the system is tested both inside and outside of the firewall. Now, let’s get back to CLT and how it benefits the testing process.

- Simple: Cloud load testing is easy to use and adjust. To run it, you only need to provide a simple script (such as a JMeter script) to the cloud tool and set a number of parameters. Then hit the run button, and the cloud will take care of the rest thanks to its built-in resources.
- Quick: Thousands of simulated users in the cloud environment put load on your application, and within minutes the test has assessed how the system behaves under heavy load.
- Global Scope: The most essential difference between cloud-based testing and traditional load testing is that cloud platforms can mimic user load from numerous locations around the globe. Testers thus get an overview of how the app will perform when machines at various locations access it simultaneously.
- Configurable: Running a workload evaluation on a cloud tool gives you different options to adjust the test. You can segment the nodes and bundle them into groups depending on, for instance, device type and operating system. You can change the variables based on your needs, and keep them if you want to repeat the test at a later stage.
- Affordable: Another benefit of CLT is the value you get for the money you spend.
In general, load testing is very affordable. Licensing plans often follow a pay-as-you-go or pay-as-you-use principle, so you’re not obliged to dedicate resources to cloud testing if you only need it occasionally.

What’s on the Cloud Vendor Market?

One of our clients, a video streaming service, needed to have its system load-tested, so we researched the cloud vendors currently on the market. The market is comparatively compact and consists of technology veterans (Amazon, Azure, BlazeMeter) as well as smaller startups (Flood IO and Testable IO). After omitting platforms that give insufficient information or don’t support JMeter scripts, we were left with the following five options:

- Flood IO
- BlazeMeter
- OctoPerf
- Testable IO
- Azure DevOps

With this shortlist at hand, we scheduled live demos with four of the platforms above to help us pick the most appropriate one for a load test of our client’s streaming application. Here’s what we found:

Flood IO: You can upload your JMeter or Selenium test plan into this tool, or build your test using its graphical user interface (GUI). On the plus side, you can run a load test with an unlimited number of concurrent users for free within 5 hours. You can use Flood not only for website performance testing but also for DNS, API, and other kinds of load testing. The tool gives you a real-time timeline with concurrent users, response time, network throughput, latency, and transaction metrics.

Advantages:
- Variety of geolocations
- Good charts and analysis
- Nice support
- Free demo and consultation offered

Disadvantages:
- Very limited trial
- Relatively expensive

Example: $149 is the cost for tests consisting of 100 users distributed across 15 different locations.

BlazeMeter: To run a load test on BlazeMeter, you only need to enter a URL and the number of concurrent users (a maximum of 20 is supported). No specific software is required.
In this mode the test isn’t scripted (although the platform does support JMeter), and it validates access to just one web page at a time. The free demo doesn’t allow you to set multiple regions or test multiple steps in a transaction. However, BlazeMeter provides real-time results and a neat results page where you can take a more detailed look at the different categories.

Advantages:
- Versatile and powerful
- Over 50 locations
- Customized plans
- Good customer support
- Free demo and consultation offered

Disadvantages:
- Very expensive

Example: $649 is the cost to test with 100 virtual machine (VM) users at over 50 locations.

OctoPerf: OctoPerf is a cloud load testing tool based on JMeter and the Rancher load balancer. No installation is required, as you can run the test in your browser.

Advantages:
- Eight locations
- Customized plans
- Good customer support
- Free demo and consultation offered

Disadvantages:
- Relatively expensive
- Locations only in the US and Canada

Example: €149 is the cost for 100 concurrent users at eight locations in the US and Canada.

Testable IO: The Testable platform supports JMeter, as well as Gatling, Locust, Node.js, Webdriver.io, Selenium Java, Serenity BDD, PhantomJS, SlimerJS, record-and-replay, and HAR files. It employs AWS on-demand servers and allows you to use your own AWS account. Once you specify your performance requirements, Testable carries out the test and determines how many concurrent users the system can serve while meeting these requirements. You also get to see how that number changes across test runs.

Advantages:
- Locations on most continents
- Cheap ($0.075/host/hour)
- Versatile pricing
- Prepaid credits
- Live results
- Different VMs to choose from
- Easy to use
- Endorsed by AWS
- Good customer support
- Free demo and consultation offered

Disadvantages:
- Not very sophisticated metrics and reports

Example: $15 is the price to test with 100 users from eight regions, using one machine per region.

Azure DevOps: This tool offers a variety of locations and a free trial.
However, we weren’t able to schedule a live demo with Azure DevOps. Since it doesn’t support the latest JMeter versions, its prices are hard to calculate, and its support is rather slow, we opted not to explore it any further. Here’s what we were able to find out, though:

- Seventeen locations around the world
- Test duration can be chosen
- Number of agents can be chosen
- Latest supported version of JMeter is 3.2
- Only HTTP samplers are supported
- Highest number of agents is 25
- Multiple JMeter tests can be created with different load locations
- 20,000 virtual user minutes (VUMs) per month are free
- A project has to be created and linked to Git in order to make the plan paid

Choosing the Right Cloud Vendor

Our client’s platform streams sports events both live and as video on demand (VOD), so we had to zoom in on the cloud-based test service best suited to load-testing it. Going through the functionalities of the five load testing tools above gave us a neat overview of the options they provide. Several features emerged as most important for the purposes of our case study. The testing tool should:

- Be cost-effective
- Support JMeter
- Be able to run tests from at least eight regions
- Have online support

The only tool that matched all criteria, and also outperformed the others in terms of cost-effectiveness and customer support, was Testable. We moved forward with it for our load test.

Case Study

As already mentioned, our client streams sports events LIVE or later as VOD. We therefore needed to evaluate how the system performs in both scenarios, i.e. how many users can access the platform in LIVE mode and in VOD mode, respectively. Our ultimate goal was to prove that 100 users could watch LIVE or VOD streams without experiencing any lag within the system. Since it’s impossible to mimic the presence of 100 users with only one machine, we created a simulation with many machines in various regions. In this regard, Testable did a great job.
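To make the idea of spreading simulated users over several machines concrete, here is a minimal Python sketch. It assumes one load-generating machine per region and an even split of users; both are illustrative assumptions, and the function name `split_users` is hypothetical. Testable's own allocation logic may well differ.

```python
from __future__ import annotations


def split_users(total_users: int, regions: list[str]) -> dict[str, int]:
    """Spread total_users as evenly as possible, one load generator per region.

    Assumption for illustration only: an even split with one machine per region.
    """
    if not regions:
        raise ValueError("at least one region is required")
    if total_users < 0:
        raise ValueError("total_users must not be negative")
    base, extra = divmod(total_users, len(regions))
    # The first `extra` regions take one additional user so the total is preserved.
    return {
        region: base + (1 if index < extra else 0)
        for index, region in enumerate(regions)
    }


# Example region names; any list of distinct region labels works.
REGIONS = [
    "N. Virginia", "Oregon", "Singapore", "EU Ireland",
    "EU Frankfurt", "Sydney", "N. California", "Ohio",
]

if __name__ == "__main__":
    for region, users in split_users(100, REGIONS).items():
        print(f"{region}: {users} users")
```

With 100 users and eight regions, the split cannot be perfectly even, so some machines carry one user more than others. A user count that is a multiple of the number of regions avoids that imbalance.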
JMeter Script

We prepared a JMeter script that imitates watching a stream from the back-end side using the web services. It is valid for both LIVE and VOD, but the parameters have to be changed to switch between the two. The script consists of these pseudo steps:

1. Get the playlist of the stream
2. Extract the chunklist
3. Get the chunklist of the stream
4. Extract the streams
5. Get the streams

After uploading the JMeter script into Testable, we set the following resources on the target servers:

- LIVE: 32CPU 128GB
- VOD: 4CPU 16GB (current configuration) and 16CPU 64GB (new configuration)

With these adjustments in place, we launched the load tests of the platform.

Results and Analysis

In terms of LIVE streaming, the tests gave satisfactory results for a server of 32CPU 128GB, meaning that the current configuration is able to support at least 192 users without causing additional lagging and interruptions. Here’s a detailed account:

LIVE

96 Users
- The server we tried to hit has the following configuration: 32CPU 128GB
- The system was hit from eight regions on eight machines: N. Virginia, Oregon, Singapore, EU Ireland, EU Frankfurt, Sydney, N. California, Ohio. The machines are t3.small (2CPU, 2GB RAM)
- The median response was 156 ms.
- The throughput was 55 MiB/sec; at the beginning of the test it even showed over 200 MiB/sec
- The test was one hour long
- No errors appeared
- The resources of the machines used to hit the servers were not fully used: Memory 92%, CPU 75%

192 Users
- The server we tried to hit has the following configuration: 32CPU 128GB
- The system was hit from eight regions on eight machines: N. Virginia, Oregon, Singapore, EU Ireland, EU Frankfurt, Sydney, N. California, Ohio. The machines are t3.small (2CPU, 2GB RAM)
- The median response was 335 ms.
- The throughput was 66.4 MiB/sec; at the beginning of the test it even showed over 200 MiB/sec
- The test was one hour long
- No errors appeared
- The CPU of the machines used to hit the servers was fully used: Memory 92%, CPU 100%.
It looks like this may have influenced the results.

168 Users
- The server we tried to hit has the following configuration: 32CPU 128GB
- The system was hit from seven regions on seven machines: N. Virginia, Oregon, Singapore, EU Ireland, EU Frankfurt, N. California, Mumbai. The machines are t3.xlarge (4CPU, 16GB RAM)
- The median response was 242 ms.
- The throughput was 223.1 MiB/sec.
- The test was 15 minutes long
- No errors appeared
- The resources of the machines used to hit the servers were not fully used: Memory 80%, CPU 82%

The VOD test showed that a configuration of 4CPU 16GB can handle around 48 users without additional lagging and interruptions. It cannot, however, support about 96 users: the lag increases and the network throughput decreases. The results were satisfactory for a 16CPU 64GB server, which is able to withstand around 96 users without additional lagging and interruptions. However, the system starts to lag more and the network throughput drops when the number of users gets close to 192.

VOD

96 Users
- The server we tried to hit has the following configuration: 4CPU 16GB
- The system was hit from eight regions on eight machines: N. Virginia, Oregon, Singapore, EU Ireland, EU Frankfurt, Sydney, N. California, Ohio. The machines are t3.small (2CPU, 2GB RAM)
- The median response was 20388 ms.
- The throughput was 21.6 MiB/sec.
- The test was 15 minutes long
- No errors appeared
- The resources of the machines used to hit the servers were almost fully used: Memory 92%, CPU 95%. This may have influenced the results

48 Users
- The server we tried to hit has the following configuration: 4CPU 16GB, Network ~90MB/s, CPU ~90%
- The system was hit from three regions on eight machines: Oregon, N. California, Ohio. The machines are t3.small (2CPU, 2GB RAM)
- The median response was 1037 ms.
- The throughput was 77.7 MiB/sec.
- The test was 15 minutes long
- No errors appeared
- The resources of the machines used to hit the servers were not fully used: Memory 91%, CPU 78%

96 Users
- The server we tried to hit has the following configuration: 16CPU 64GB, Network ~190MB/s, CPU ~70%
- The system was hit from three regions on eight machines: Oregon, N. California, Ohio. The machines are t3.xlarge (4CPU, 16GB RAM)
- The median response was 933 ms.
- The throughput was 158.7 MiB/sec.
- The test was 15 minutes long
- No errors appeared
- The resources of the machines used to hit the servers were not fully used: Memory 84%, CPU 68%

192 Users
- The server we tried to hit has the following configuration: 16CPU 64GB
- The system was hit from three regions on eight machines: Oregon, N. California, Ohio. The machines are t3.xlarge (4CPU, 16GB RAM)
- The median response was 21212 ms.
- The throughput was 38.5 MiB/sec.
- The test was 15 minutes long
- No errors appeared
- The resources of the machines used to hit the servers were not fully used: Memory 24%, CPU 53%

What We Learned

We tested our client’s video streaming platform to see whether it can handle significant load during LIVE and VOD streaming. Thanks to JMeter and the cloud-based testing tool Testable, the client now has enough information about how many concurrent users the system can support and with which server configuration. In this way, the company will be better prepared for the challenges of real-life usage of the application.

Cloud load testing is a great way for us to supply our clients with detailed performance-testing results where significant load is expected, because it allows us to simulate load from different geographical locations at relatively affordable prices.
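As a compact recap of the runs reported in Results and Analysis, the following Python sketch tabulates them and derives the highest user count per server configuration that stayed responsive. The figures are copied from the article; the 2000 ms cut-off (`MAX_MEDIAN_MS`) and the names `Run` and `max_healthy_users` are assumptions introduced purely for illustration, since the article judges the runs qualitatively and states no numeric threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    mode: str                # "LIVE" or "VOD"
    server: str              # target server configuration
    users: int               # simulated concurrent users
    median_ms: float         # median response time
    throughput_mib_s: float  # network throughput


# Figures as reported for each run in the article.
RUNS = [
    Run("LIVE", "32CPU 128GB", 96, 156, 55.0),
    Run("LIVE", "32CPU 128GB", 192, 335, 66.4),
    Run("LIVE", "32CPU 128GB", 168, 242, 223.1),
    Run("VOD", "4CPU 16GB", 96, 20388, 21.6),
    Run("VOD", "4CPU 16GB", 48, 1037, 77.7),
    Run("VOD", "16CPU 64GB", 96, 933, 158.7),
    Run("VOD", "16CPU 64GB", 192, 21212, 38.5),
]

# Hypothetical cut-off, NOT from the article: a run counts as healthy
# when its median response time stays below this value.
MAX_MEDIAN_MS = 2000


def max_healthy_users(runs, max_median_ms=MAX_MEDIAN_MS):
    """Return the highest healthy user count per (mode, server) pair."""
    best = {}
    for run in runs:
        if run.median_ms < max_median_ms:
            key = (run.mode, run.server)
            best[key] = max(best.get(key, 0), run.users)
    return best


if __name__ == "__main__":
    for (mode, server), users in max_healthy_users(RUNS).items():
        print(f"{mode} on {server}: up to {users} users")
```

Under that assumed cut-off, the sketch arrives at the same picture as the written analysis: 192 users for LIVE on 32CPU 128GB, 48 users for VOD on 4CPU 16GB, and 96 users for VOD on 16CPU 64GB. A different threshold could shift these numbers, so it should be set from the actual playback requirements.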