Posts Tagged ‘Platform’

Apple iPhone Event – Cloud Adoption

October 3, 2011

By Varun Parmar, Director of Product Management, YouSendIt

Like many others across the world, I will be tuning in to Apple’s event tomorrow. The imminent release of the iPhone 5 and iCloud is great news for the industry and should spark a significant spike in the adoption of cloud services. Instead of tens of millions of people using online content storage, hundreds of millions or even billions will be doing so. As a result, more and more people will become comfortable with storing and accessing content from the cloud.

Why?

Apple will integrate iCloud seamlessly into its existing platform, the platform already used and loved by millions. So it won’t be long before accessing content from the cloud anywhere at any time becomes second nature.

Apple has designed an ecosystem that works perfectly within its own applications and devices. Google generated buzz around its Google Drive available on the Chrome OS last week. We can expect Microsoft to do something similar with Windows 8 and Windows Mobile. Clearly, each of these platform vendors will focus mostly on its own desktop and mobile platforms – Apple (OS X + iOS), Google (Chrome OS + Android), and Microsoft (Windows + Windows Mobile).

However, people do not live solely in one ecosystem. It’s obvious to us that people are carrying an array of Apple, Google, and other devices at once. Even if you are on one platform, you work with vendors, customers and partners who operate on platforms of their choice. Say you work at a creative agency. You design on a Mac and save your projects to iCloud, but your client works on a PC. So when it comes time to get your client’s input or sign-off on the project, you’re stalled.

We live in a heterogeneous society, where we need cloud applications that work with all devices and all platforms. They also need to be business friendly, meaning you can use the application inside and outside the office. The ability to access your data anywhere at any time is affecting the speed at which we conduct daily business. We can now expedite business because we can, for example, send and sign a critical document outside normal business hours from a mobile phone. The future is going to be increasingly about the cloud applications you use on your device instead of the device itself.

Design of the YouSendIt Platform – Part 7

November 8, 2010

Implementation

New Process

YouSendIt management was well aware of the complexity of this project, and set up a separate project (apart from the main Platform launch), internally code-named ‘Super POC’ (super proof-of-concept), to finalize product requirements, the migration strategy, and new and improved SDLC practices. This also provided necessary and critical team building before committing to an aggressive platform launch schedule.

The entire SDLC was rethought, and new processes were built to run unit tests and measure code coverage for every platform build, to ensure backward compatibility and maintain minimum quality standards. The team customized the Scrum development methodology to deliver predictable releases of the platform to the application team. A typical sprint took three weeks (fifteen working days): two days of planning, ten days of development, two days of integration, and one day of testing at the end.

Launch

The new YouSendIt platform was launched in Q1 2010, as per the committed schedule. Despite lots of moving parts – a whole new network infrastructure, new monitoring, and a new product – we are happy to report that the platform launch met the exact release date! The results of using the new SDLC were also very clear: the bug count for the entire platform project never went above 20 bugs at any given time!

Summary

Designing the YouSendIt platform, and being part of building it, was a great learning experience and a satisfying one. The new platform can easily be extended to future technologies such as graph and social networks, to the big data problems of sensor networks, and to adjacent services like electronic signatures, without sacrificing the scalability or reliability of the overall system.

As of this writing, six months after the new platform was rolled out and put into active use, we are highly encouraged by the reliability of the new architecture: there has not been a single instance of the new platform being unable to service users (a common early occurrence in other large system rollouts that we read or hear about in the industry).

The operations team has a formula for scale-out and capacity planning. Additionally, the Error Console provides daily feedback on any unhandled errors happening in production (without scouring or parsing any logs), which helps the team address issues before they become bigger.

The development team is able to focus on new product creation, and not on debugging and patching the system! YouSendIt development teams are now confidently and aggressively planning to add new features to the service portfolio.

Key takeaways:

  1. In a growing company, study and analyze the system in place. This will provide valuable lessons and experience.
  2. Gather requirements and wishes from key stakeholders and management. Some expectations might contradict the system you have in mind.
  3. Take accountability and responsibility for coming up with a solution which synthesizes the varied point solutions and satisfies all requirements. You might be surprised at some of the goodness that may come as a side benefit!
  4. Big projects which change a system inside out have a lot of moving parts. Don’t be shy about coming up with new processes and methodologies for execution.

We are confident that this new platform will let us deliver a new set of value-added applications to YouSendIt users faster.

Read Part 1, 2, 3, 4, 5, 6

By Sumeet Rohatgi

Design of the YouSendIt Platform – Part 6

October 29, 2010

Putting it all together (continued)

Flexibility, Agility, AND Scalability

Choosing the right language for the platform was imperative, and the team chose Java. A lot of tools for enhancing developer productivity were immediately available: IDEs, libraries, profilers, and unit testing, code coverage, and performance testing tools. To support IDE-based development, the platform was designed to be deployed in two ways: one tuned for operations (where web service calls were out of process), and the other tuned for developers (where all web service calls were made in the same process space as the application server). The second mode makes IDE-based debugging much more useful.

All of the previously mentioned functionality was physically packaged into what is called a Platform Engine (depicted in Figure 3).

Figure 3: Platform Engine detail

Interesting features and properties of a Platform Engine:

  1. An engine connects to multiple read databases, and a single write database
  2. An engine capacity protects (limits) concurrent read and write web service requests by rejecting any connections above a certain threshold.
  3. An engine can be co-hosted with other engines (useful for developers to deploy to local machines)
  4. Engines can be built out of different source code trees (useful for distributed development)
  5. Engines can house other SaaS cloud offerings (adding capacity protection and abstracting away the SaaS offering)
  6. Engines are configured from a single XML configuration file at install/ runtime
  7. An engine publishes its list of APIs using industry-standard WSDL definition files
  8. An engine adds a layer of additional authentication and authorization for any web service calls it services
  9. Multiple engines can be clustered at runtime to add capacity
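
Property 2 (capacity protection) can be illustrated with a minimal sketch. This is not the platform’s actual code: the class name `CapacityGate`, its methods, and the exception type are hypothetical, and the only assumption taken from the list above is that reads and writes are limited separately and that excess requests are rejected rather than queued inside the engine.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical sketch of per-engine capacity protection; all names are illustrative. */
class CapacityGate {
    /** Thrown when the gate is full; a caller can map this to a "retry later" error code. */
    static class CapacityExceededException extends RuntimeException {
        CapacityExceededException(String kind) {
            super("capacity exceeded for " + kind + " requests");
        }
    }

    private final Semaphore readSlots;
    private final Semaphore writeSlots;

    CapacityGate(int maxConcurrentReads, int maxConcurrentWrites) {
        this.readSlots = new Semaphore(maxConcurrentReads);
        this.writeSlots = new Semaphore(maxConcurrentWrites);
    }

    <T> T read(Supplier<T> work) {
        return run(readSlots, "read", work);
    }

    <T> T write(Supplier<T> work) {
        return run(writeSlots, "write", work);
    }

    private static <T> T run(Semaphore slots, String kind, Supplier<T> work) {
        // tryAcquire() never blocks: a full gate rejects the request immediately.
        if (!slots.tryAcquire()) {
            throw new CapacityExceededException(kind);
        }
        try {
            return work.get();
        } finally {
            // Always give the slot back, even if the work throws.
            slots.release();
        }
    }

    int availableReadSlots() {
        return readSlots.availablePermits();
    }
}
```

A caller would wrap each service invocation, for example `gate.read(() -> lookupFile(id))` with a hypothetical `lookupFile`, and translate `CapacityExceededException` into a well-defined error response instead of letting the server fall over.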

Runtime Efficiencies

Operations friendliness was achieved through several specially designed features on top of the basic Platform Engine building block. Standardized log statements were inserted automatically at strategic points by code generation, so as not to depend too much on developers doing the right logging. In addition, an Engine is able to monitor its own health and publish errors to a specially designed Error Console engine at runtime. The Error Console engine database could then be monitored in near real time for any system-wide errors.

As noted previously, configuring web and application servers is often an afterthought, resulting in complex, unmanageable systems. The team decided very early on to have one simple XML file per module in the system, and to use it to generate the other application configuration files (for logging, the application container, etc.). This XML file was the same for a given module on all machines. At install time the file was expanded to generate all the local configuration files needed. This drastically reduced the knowledge required to get an environment up and running: the software was able to inspect the environment and follow the guidelines in the XML file to generate appropriate configuration files. XML is the lingua franca of web services, and this enables a possible future enhancement of centralized configuration servers, leading to powerful automation capabilities for spinning up whole new instances of application servers on demand.
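
The install-time expansion step might look roughly like the sketch below. The XML vocabulary (`module`, `target`, `property`) and the `${module}` / `${host}` placeholders are invented for illustration; the post does not describe the real file format. The sketch only shows the idea of one descriptor per module being expanded into several local configuration files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: expand one module descriptor into per-tool config file bodies. */
class ModuleConfigExpander {
    /**
     * Returns a map from target file name to generated file content.
     * The element and attribute names are assumptions, not the real format.
     */
    static Map<String, String> expand(String moduleXml, String hostName) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(moduleXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse module descriptor", e);
        }
        Element module = doc.getDocumentElement();
        String moduleName = module.getAttribute("name");

        Map<String, String> files = new LinkedHashMap<>();
        NodeList targets = module.getElementsByTagName("target");
        for (int i = 0; i < targets.getLength(); i++) {
            Element target = (Element) targets.item(i);
            StringBuilder body = new StringBuilder();
            NodeList props = target.getElementsByTagName("property");
            for (int j = 0; j < props.getLength(); j++) {
                Element prop = (Element) props.item(j);
                // Fill in values that are only known on the machine being installed.
                String value = prop.getAttribute("value")
                        .replace("${module}", moduleName)
                        .replace("${host}", hostName);
                body.append(prop.getAttribute("key")).append('=').append(value).append('\n');
            }
            files.put(target.getAttribute("file"), body.toString());
        }
        return files;
    }
}
```

An installer built this way would write each map entry to disk under the given file name, so the same descriptor yields machine-specific logging and container settings on every host.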

To meet scalability objectives, the team decided that data would be housed in multiple databases. For flexibility, independent application modules would read from these different databases. Further, the team decided to ‘capacity protect’ write database transactions separately from read database transactions. This also had the benefit that batch jobs could be run as a list of web service calls at any time, so messy night-time batch jobs were no longer required.

To aid system-wide debugging, the team also designed Runtime Tracing. Tracing was designed to be turned on on demand (from a GUI) at various levels – machine, service API, user, and account. This GUI tool could also be used by customer service to help reproduce those knotty “it only happens on the customer’s own machine” issues!
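
As a rough illustration of tracing that can be switched on at those four levels, consider the sketch below. The class `TraceSwitch` and its methods are hypothetical; the post names the levels but not the mechanism, so the rule used here (a request is traced if any one of its identifiers is switched on) is an assumption.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of on-demand tracing switches; not the platform's real API. */
class TraceSwitch {
    /** The four levels named in the post. */
    enum Scope { MACHINE, SERVICE_API, USER, ACCOUNT }

    // For each scope, the identifiers that currently have tracing turned on.
    private final Map<Scope, Set<String>> enabled = new ConcurrentHashMap<>();

    /** What a GUI "turn tracing on" action would call. */
    void enable(Scope scope, String id) {
        enabled.computeIfAbsent(scope, s -> ConcurrentHashMap.newKeySet()).add(id);
    }

    /** What a GUI "turn tracing off" action would call. */
    void disable(Scope scope, String id) {
        Set<String> ids = enabled.get(scope);
        if (ids != null) {
            ids.remove(id);
        }
    }

    /** Assumed rule: trace the request if any of its four identifiers is switched on. */
    boolean shouldTrace(String machine, String serviceApi, String user, String account) {
        return isOn(Scope.MACHINE, machine)
            || isOn(Scope.SERVICE_API, serviceApi)
            || isOn(Scope.USER, user)
            || isOn(Scope.ACCOUNT, account);
    }

    private boolean isOn(Scope scope, String id) {
        Set<String> ids = enabled.get(scope);
        return ids != null && ids.contains(id);
    }
}
```

With a switch like this, customer service could enable tracing for a single user, ask the customer to repeat the failing action, and then turn tracing off again without touching other traffic.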

Read Part 1, 2, 3, 4, 5

By Sumeet Rohatgi

Design of the YouSendIt Platform – Part 5

October 14, 2010

Putting it all together

The team was now tasked with synthesizing all of these considerations and possible candidate solutions into a cohesive whole which would satisfy all the requirements. By deeper analysis of the problem space, it was clear that point solutions would not take care of all the requirements.

Reliability

The team studied reliability theory, and came up with a principle of capacity protection to handle extreme and uncontrolled load patterns. Research guided us into using Message Queuing as a way to guarantee processing (reliability). Typically, messaging capability is reserved for offline processing, where real-time response is not required – but the team realized that this was where innovation and fresh thinking were required!

The synthesized solution leveraged SOAP’s extensible XML message packets to travel over both HTTP and JMS transport layers. For reliability, all web service calls capable of modifying persistent data now traveled through message queues. However, this alone would not be a strong guarantee of reliability, as data modification services in many cases used data read services for validation or processing. So the new platform was designed to reject connections after reaching a predefined threshold. Well-defined error codes for rejected connections allowed for automated replay of queued transactions at a later time. This also led to designing data modification around an UPSERT-based model.

Under this model, each modification call relies on the ACID properties of the database to lock the required record and update it in place; if the record does not exist, a new record is inserted. This required the update APIs to carry enough input information to insert a record as well. An example will help. Consider three atomic transactions: create a user, update email, and update password. Depending on the application, these operations could arrive in parallel, and so out of order. This algorithm and data design helped avoid race conditions arising from such a sequence mismatch between modification web service calls. Without it, a failed update would lead to corrupted data being saved in the database.
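
The three-transaction example can be sketched in a few lines. This is an in-memory stand-in, not the platform’s data layer: `ConcurrentHashMap.compute` plays the role of the database row lock (it runs atomically per key), and the `UserStore` and `User` names are hypothetical. In a relational database the same idea is usually expressed in SQL, for instance with MySQL’s `INSERT ... ON DUPLICATE KEY UPDATE`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory sketch of the UPSERT model; names are illustrative. */
class UserStore {
    /** Immutable user row; a null field means "not set yet". */
    static final class User {
        final String id;
        final String email;
        final String password;

        User(String id, String email, String password) {
            this.id = id;
            this.email = email;
            this.password = password;
        }
    }

    private final Map<String, User> rows = new ConcurrentHashMap<>();

    /** Create: insert an empty row, or keep the row if an update got there first. */
    void createUser(String id) {
        rows.compute(id, (key, old) -> old != null ? old : new User(key, null, null));
    }

    /** Update email in place, inserting the row if it does not exist yet. */
    void updateEmail(String id, String email) {
        rows.compute(id, (key, old) -> new User(key, email, old == null ? null : old.password));
    }

    /** Update password in place, inserting the row if it does not exist yet. */
    void updatePassword(String id, String password) {
        rows.compute(id, (key, old) -> new User(key, old == null ? null : old.email, password));
    }

    User find(String id) {
        return rows.get(id);
    }
}
```

Because every operation either updates the existing row or inserts a new one, the calls can be applied as create, email, password or as password, email, create, and the stored record ends up the same.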

Performance

Network latency considerations led the team to a chunky API design as opposed to a chatty one. Chunky API design means producing large enough outputs from APIs to minimize back-and-forth traffic from client user interfaces. A rule of thumb is that one API call should be enough to render an entire webpage. However, introducing even a small amount of latency into ‘synchronous’ web service calls (such as payments or registration) would not be acceptable to YouSendIt users. For these types of transactions, the team used distributed caching to proactively load the affected objects into the cache while the actual message was still in the message queue. This cache was always updated on any modification API call (in effect creating a cache-fronted database).
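
A minimal sketch of the cache-fronted idea follows. It uses plain in-process maps where the platform used a distributed cache and a real database, and the class and method names (`CacheFrontedStore`, `acceptWrite`, `applyQueuedWrite`) are hypothetical. What it tries to show is the ordering: the cache is updated as soon as a modification is accepted, while the database write happens later, when the queued message is consumed.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a cache-fronted store; maps stand in for cache and database. */
class CacheFrontedStore<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Map<K, V> database; // stand-in for the real database

    CacheFrontedStore(Map<K, V> database) {
        this.database = database;
    }

    /** Called when a modification is accepted and enqueued: readers see it at once. */
    void acceptWrite(K key, V value) {
        cache.put(key, value);
    }

    /** Called later by the queue consumer to persist the queued change. */
    void applyQueuedWrite(K key, V value) {
        database.put(key, value);
    }

    /** Reads hit the cache first and fall back to the database on a miss. */
    Optional<V> read(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = database.get(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return Optional.ofNullable(value);
    }
}
```

With this arrangement, a user who has just registered or paid can be shown the new state immediately, even though the durable write is still waiting in the message queue.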

Read Part 1, 2, 3, 4

By Sumeet Rohatgi

Design of the YouSendIt Platform – Part 4

October 4, 2010

Understanding the Problem Continued

Uncontrolled System Growth

Big data growth was another concern for YouSendIt. It resulted in wide and long tables strewn all over the MySQL database, which in turn led to insertion hot spots and high latencies when fetching list data. Exploring the landscape, Oracle RAC was being touted as the answer to a developer’s prayer of “do not worry about data size growth!” Unfortunately, this was not a wallet-friendly option for YouSendIt.

System capacity planning was a huge problem, since YouSendIt traffic could spike based on the popularity of a file being exchanged by registered and non-registered users. In addition, ongoing server reboots, client reconnects, disk drive corruption, and database crashes complicated the issue further. When systems crashed and connections failed, we wanted to ensure full redundancy and fail-over, with a guarantee that committed transactions would complete. However, a naive implementation that uses a messaging queue for all transactions results in huge time and processing overheads, which is not acceptable to YouSendIt users accustomed to split-second responses.

Another often ignored aspect of large system design is configuration. Even a simple web server in a data center requires a host of configuration properties which need to be set exactly, or else the repeatability and predictability of responses and monitoring suffer. Managing application configuration is often a poorly designed and poorly defined activity. The team recognized these problems and was determined to solve them.

Declining Productivity

Developer friendliness, on the other hand, was all about reducing the time developers spent on understanding the entire system, and leaving more time for actually solving the problem at hand. It was also about helping detect any side effects caused by adding or removing features in an already running system. IDE-based development is an industry-accepted way to enhance developer productivity. However, in server-based programming it can result in developers putting fewer log statements in their code (due to the availability of interactive debugging). Moreover, IDE-based debugging has limited use in a web service environment, because calls between services are made out of process.

There was also the desire for rapid integration of third-party data and cloud services to add more value to the YouSendIt product portfolio. How would one easily add potentially unreliable services without sacrificing the scalability and reliability of the overall YouSendIt service platform?

Even if the team were able to design this new platform addressing all the listed concerns, they would have a hard time controlling quality after shipping a large, distributed, and interconnected system. Constant code base churn due to feature additions also brings in many side effects, which are found only through costly QA cycles. Traditionally, over time, large systems become slower, harder to debug, and almost impossible to add features to. In fact, this situation would be very similar to the system that was being replaced!

Read Part 1, 2, 3

By Sumeet Rohatgi

Design of the YouSendIt Platform – Part 3

September 28, 2010

Understanding the problem

Philosophy

One of the key strategies used by the team was to think through the problem space as a whole, and not in pieces, as is the usual first reaction (“let’s break the problem into pieces and solve each separately” – divide and conquer). In our opinion, such solutions tend to be piecemeal, and do not really address the situation in ways which are complementary. For example, solving scalability and flexibility concerns separately would result in a complex system tied together through lots of documentation and user guides trying to mitigate each solution’s effects on the other.

Exploring Choices

Initial discussions were about molding the existing (LAMP) programs to fit the requirements. However, an ‘old baggage’ mentality immediately came along for the ride: new processes, designs, and ideas would get smothered with “we already tried that, it did not work!” Also, in our opinion, PHP as a backend language had some real limitations (in database connection pooling, thread management, code coverage toolsets etc.) for a system where reliability and robustness were key requirements. Scalability in such a system would also be suspect and, if not handled carefully, would impact developer productivity. So even though this seemed to be the easiest path to take, the team decided to consider further options. As language choices go, out of the remaining ‘C’ syntactic style languages, the choice was between C, C++, Java, and C#. We limited our choice to ‘C’ style languages so as not to shrink our talent acquisition pool. C and C++ were ruled out due to the high expertise and skill required for memory management; C# was rejected because the .NET framework was only available on the Windows platform.

The nature of programming on the internet requires supporting web services, where APIs are either REST (Representational State Transfer) or SOAP (Simple Object Access Protocol) style. Industry folklore is that REST is an easy way to integrate and create multi-site mash-ups, while SOAP is considered to have additional performance overhead. Our experience shows that there is little systematic difference between the two styles (both are built using XML and HTTP verbs such as GET, POST, and DELETE). SOAP has the added benefit of transport independence, which allows APIs to migrate easily from web transport (HTTP) to others like messaging (JMS). Expressed as an equation:

REST + conventions + toolsets = SOAP

The extra toolsets available for SOAP also increase developer productivity, so it was a no-brainer to choose SOAP as the definition style for Platform services. So as not to lose the ad hoc mash-up and hacking simplicity of REST-style web services, it was decided to automagically generate REST endpoints for the SOAP services. Figure 2 illustrates the same web service being called in SOAP and REST styles.

Figure 2: Same web service called in REST and SOAP styles (abbreviated for simplicity)
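
To make the comparison concrete, here is a rough sketch of one operation rendered both ways. It is not a reproduction of Figure 2: the operation name `getFileInfo`, the parameter `fileId`, the service namespace, and the base URL used in the usage note are all made up, and values are assumed not to need XML or URL escaping. Only the SOAP 1.1 envelope namespace is a real, standard value.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch: one operation expressed as a SOAP body and as a REST URL. */
class DualStyleRequest {
    /** Standard SOAP 1.1 envelope namespace. */
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    /** SOAP style: operation and parameters travel inside an XML envelope (sent via POST). */
    static String soapBody(String serviceNamespace, String operation, Map<String, String> params) {
        StringBuilder xml = new StringBuilder();
        xml.append("<soap:Envelope xmlns:soap=\"").append(SOAP_ENV).append("\">")
           .append("<soap:Body>")
           .append('<').append(operation).append(" xmlns=\"").append(serviceNamespace).append("\">");
        // No escaping here: the sketch assumes simple alphanumeric names and values.
        params.forEach((name, value) ->
            xml.append('<').append(name).append('>').append(value)
               .append("</").append(name).append('>'));
        xml.append("</").append(operation).append('>')
           .append("</soap:Body></soap:Envelope>");
        return xml.toString();
    }

    /** REST style: the same operation and parameters become a URL (sent via GET). */
    static String restUrl(String baseUrl, String operation, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&", "?", "");
        query.setEmptyValue(""); // no parameters means no "?" at all
        params.forEach((name, value) -> query.add(name + "=" + value));
        return baseUrl + "/" + operation + query;
    }
}
```

Calling `restUrl("https://api.example.com/rest", "getFileInfo", params)` with a single `fileId=42` parameter yields `https://api.example.com/rest/getFileInfo?fileId=42`, while `soapBody` wraps the same operation and parameter in an envelope. Since both are derived from the same operation name and parameter map, a generator can produce the REST endpoint mechanically from the SOAP definition.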

There are some well-known downsides of using web services: performance and reliability are not easily achieved (point solutions can be put in place, but overall systematic solutions are not so intuitive). Users are forced to write small state machines and use the database in order to have a basic guarantee for multistep transactions. Another, even more unattractive, option is to have database resource managers perform two-phase commits. This reduces developer productivity and puts more pressure on the database. In addition, DDoS attacks or high traffic typically result in server crashes, and users being shown a server-down message (the dreaded 5xx HTTP error code family) without a clear idea of whether their transaction was ultimately successful. Also, a simple, straightforward application of the principles of SOA (service-oriented architecture) results in lots of RPCs (remote procedure calls) over the network, as one web method calls another, which could in turn call others, and so on.

Read Part 1, 2, 4

By Sumeet Rohatgi

Design of the YouSendIt Platform – Part 2

September 21, 2010

Requirements for the new Platform

There was a high bar for this new Platform. After discussions with various management levels and key stakeholders, the team came up with the following high level requirements:

Be scalable, agile and flexible: buckling under load was a feared but all too familiar reality in the older platform. However, there was a desire not to let scalability solutions come at the expense of the ability to develop and publish new product features

Be reliable and high performing: this was especially challenging because, unlike other web 2.0 startups, YouSendIt’s business was built on trust: users’ precious information (files, messages) had to be handled securely and guaranteed to reach the intended recipient on time. At the same time, reliable systems traditionally bring to mind a ‘slow’ tortoise as opposed to a faster hare – the team definitely wanted to avoid this situation

Be operation and developer friendly: as a system evolves over time (and especially when features and components get added at a rapid pace), typical software development environments run into problems of managing quality – code related, feature related, architecture related, and performance related. At the same time, putting in rigid processes for managing quality typically hampers developer productivity, and results in a bureaucratic environment where it takes ages to accomplish new things

Last but not least – Be wallet and launch schedule friendly: as a startup, this was another goal that the team aspired to: be faster and cheaper along with being better!

As anyone who has done software development for a while can imagine, this was a tall order. The ‘and’ in each of the requirements goes against accepted industry wisdom for a typical software project. Negotiations in the software program management world typically go like this: “only two of the following three can be fulfilled: time, people, and scope!”

Read Part 1, 3, 4

By Sumeet Rohatgi

Design of the YouSendIt Platform – Part 1

September 15, 2010

YouSendIt’s business has thrived over the years, from its inception in 2004 to date. YouSendIt’s basic product, ad hoc person-to-person large file transfer on the internet, has enjoyed tremendous popularity and growth. The product provides real user value while reducing IT operational spend. The product has all the benefits of a SaaS solution:

  • Business users can purchase and provision the product on their own (single and team editions)
  • A browser is all that users need to access functionality
  • Desktop integrations make the product even easier and more convenient to use
  • There is an Enterprise version of the product with features of central management and SSO capabilities allowing seamless administration from corporate IT

    Figure 1: YouSendIt’s initial LAMP based technology system

YouSendIt is a market leader in its segment and aspires to extend its product portfolio into adjacent areas of ad-hoc digital collaboration, like electronic signatures, document transformation, web preview capabilities, and online real-time collaboration. These new areas test the ability of YouSendIt’s technology platform to rapidly assimilate new technology and power new products and applications, while keeping up with the existing scale of user growth.

YouSendIt’s initial, older LAMP technology platform implementation (pictured in Figure 1) was feeling all the pressures of a growing system not really designed for rapid growth and extensibility. The database and file servers were running hot, and development and operations engineers were constantly under the gun to provide patches for maintaining system reliability and performance. Ultimately, this slowed down the pace of putting out new product features, and YouSendIt’s position as a market leader was at risk.

There was a need to rethink the technology architecture and stack and tune them back to the vision. YouSendIt made a key strategic decision to invest in a technology Platform initiative. The main goals were to launch the company into the next stage of its growth, and to increase the velocity of new features to market (more features, in less time).

Look out for more posts about the Design of the YouSendIt Platform!

Read Part 2, 3, 4

By Sumeet Rohatgi

