Archive for March, 2007

Amazon, grid computing, C3, WeoCEO

Commodity Computing Cycles (C3) and ETech

I was preparing for ETech and ran across Jeff Barr’s recent AWS blog.  He points to a number of interesting links, including WeoCEO’s new website (thanks Jeff!). 

One of the links he points to is David Berlind’s video on “Is it time to throw away your servers?”.  It was a highly entertaining video, but more importantly it clearly laid out the business case for why cluster and grid computing is going to revolutionize this business.  We must be channeling the same psychic hotline, because it mirrors the case I laid out in the Cycles in the Sky blog earlier this week.  (However, David’s is far more entertaining, with real numbers.)

Commodity Computing Cycles (C3) is a paradigm shift in business computing.  It is coming, and to be honest, I have no way to predict the impact of the change on efficiency and productivity in the business computing arena.  I do know that in order for it to achieve its potential, those of us focusing on cluster and grid computing have to deliver some sort of Service Level Agreement (SLA).  While David points to the cost advantages, what he did not point out is the lack of an SLA from Amazon.  Someone running an ecommerce site may willingly pay the additional money shown in David’s video for a traditional data center operation, if they can be assured of up-time and bandwidth.  Without these assurances, the dollar savings obtained by using a C3 solution may be given back in poor user experience or web client customer service.

That being said, I think that we (the greater community of Amazon Web Services and EC2 users) are working towards achieving reasonable service levels upon which we can build ecommerce solutions.  We developed our WeoCEO ISO because it was required in order to host our WeoGeo geospatial exchange on EC2.  There are other service issues, such as large file ingestion (imagine trying to push a terabyte size file up to S3!), but we are confident that these too can be overcome and solutions delivered to the community.  I truly believe that the revolution is here, and like any other paradigm shift, there will be a tremendous opportunity for those willing to place their stakes in the ground to deliver solutions to those who follow.
 

On other notes, I will be taking my soapbox to ETech next week.  Find me if you would like to chat about such things as revolutions and paradigm shifts in cluster and grid computing, as well as geospatial technologies.

Background, Amazon, WeoGeo, grid computing

Cycles in the Sky

There is a revolution happening quietly in the development of web computing infrastructure. To those who have been involved in the development of large scale distributed computing, i.e., cluster and grid computing, the concepts and applications of the revolution are decades old. To the computational science community, including weather forecasters, climate change scientists, numerical ecologists, artificial intelligence experts, bomb developers, etc., these types of efforts have been at the forefront of supercomputing technology development. I have been involved in various types of cluster computers for oceanic ecological modeling, and some of our collaborators at Rutgers University are experts in the field of distributed computing for coupled atmospheric and oceanic modeling. However, for the average person, terms like cluster or grid computing have little or no tangible meaning. Perhaps a few could tie it to the SETI grid computing project, but even these few might not understand the implications at the average business or consumer level.

One of the favorite terms today for large scale distributed computing at the business and consumer level is “Web-Scale Computing”. You see it in the sessions for a couple of the O’Reilly Conferences (ETech and Web 2.0 Expo) mainly discussing Amazon Web Services’ (AWS) EC2 and S3 services. AWS is one of the first mainstream applications that put the power of cluster computing into the hands of commercial web application developers. With these services, and those that will surely follow, we as a society/culture/business community move one step closer to the concept of on-demand purchasing of computer cycles, and the development of markets in these cycles.

These services, which I will refer to as Commodity Computing Cycles (C3), are different from commodity computing. In commodity computing you are still responsible for assembling the components and the network of processors into a cluster for your distributed processing application. You are still required to pay for the power, cooling and maintenance, as well as the personnel involved in development, care, and security of the systems. These expenses are upfront and continuing throughout the life of the business, regardless of total computational use. With C3, you buy the FLOPS or the storage space needed for your application, on-demand.
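The difference between the two cost models can be sketched with a little arithmetic. This is only an illustration: the function names are made up for this example, and every figure is a hypothetical placeholder, not a price from Amazon or from any real data center.

```python
def owned_cost(upfront, monthly_fixed, months):
    """Cost of running your own cluster: paid regardless of how much you compute."""
    return upfront + monthly_fixed * months


def on_demand_cost(hourly_rate, hours_used):
    """Cost under a C3 model: you pay only for the hours you actually use."""
    return hourly_rate * hours_used


def break_even_hours(upfront, monthly_fixed, months, hourly_rate):
    """Hours of use over the period at which both models cost the same."""
    return owned_cost(upfront, monthly_fixed, months) / hourly_rate


if __name__ == "__main__":
    # Hypothetical placeholder numbers, chosen only to show the arithmetic.
    hours = break_even_hours(upfront=10_000, monthly_fixed=500,
                             months=12, hourly_rate=0.5)
    print(f"Break-even at {hours:.0f} instance-hours over 12 months")
```

Below the break-even point the on-demand model is cheaper, because the in-house expenses accrue whether or not the machines are busy.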

With C3, your business can then focus on developing better applications and services for your customers, rather than the development of the in-house infrastructure to rack, cool, and take care of your computers. If the outsourcing of FLOPS and storage makes business sense (which I truly believe it does), we should expect that the demand for C3 services will increase, leading to the building of more C3 infrastructure and therefore feeding virtuously into the creation of ever more efficient web applications and services. If the revolution seriously takes root and spreads across the whole of the business and consumer communities, it will affect us all.

To bring this discussion back home, we at WeoGeo are trying to change the dynamics of quantitative mapping. Our own maps are terabytes in size, and require petaFLOPS of processing. We developed our web exchange and server application on EC2 and S3 for many reasons, including the costs associated with the growth in our computing needs and the requirement to automatically scale as a function of computing cycle demand. In addition, by developing on a fully scalable C3 model (see below), we could pass the infrastructure savings directly to our user community. This should help enable them to develop new markets for their mapping products and hopefully lead them into a new model for generating revenue in the field of geospatial maps, services, and technologies.

The EC2 version of C3 marks the beginning of the widespread commercial use of on-demand distributed computing. I believe it is a harbinger of things to come. For our purposes, EC2 was not quite ready for prime time and we had to overlay additional intelligent management software to provide stability and optimized scaling to take full advantage of the C3 potential offered by AWS (see this WeoCEO blog post, as well as this AWS forum post by Robert Banfield). I am sure that our solution is but one of the first of many to come. The important thing to recognize is that the delivery of scalable, fully optimized, Commodity Computing Cycles is happening right now and will only get better, easier, and cheaper with time. I believe that the next phase of productivity enhancement in the business and consumer markets begins now, and it is truly exciting to be a part of this wave.

Background, Remote Sensing, Amazon, geospatial

Whether it is $3.6 or $7.0 Billion, it is still a big market

I ran across a recent post by Roger Hart at GeoCarta that highlighted a remote sensing market report (BCC Research) suggesting the total world-wide market for remote sensing products was on the order of $7 billion in 2006. This number is similar to the $3.6 billion for 2006 estimated by Daratech, if you remove weather forecasting and climate change studies from their 2006 estimate.

These are big numbers. However, the total remote sensing and geospatial markets are also segmented, with lots of niches that make it difficult to develop economies of scale in the collection of data, or the creation of derivative products.

I have a sense that this is changing: the growing demand for products will run right into the ability of individuals to create content using base maps provided by large scale mapping projects (e.g. NAIP). I believe that we may be approaching a cusp period in the development of geospatial markets, where the benefits of low cost powerful servers and commodity computing (a la Amazon Web Services EC2/S3), combined with robust open source geospatial software (e.g. GDAL) and the innovative power of individuals and small businesses, will begin to impact the traditional government services model. I see the impact to be greater supplies of content at lower cost points, resulting in an ever increasing demand for geospatial products.

I am not quite sure who wins or loses in this period. I would like to think that a rising tide raises all boats. I do think that it will be a period of rapid change, so if you are doing the same old thing, with the same old tools, it might be time to reassess your business model.

Amazon, WeoGeo

WeoCEO – How to use the true power of Amazon Web Services

As mentioned earlier, Amazon Web Services (AWS) is offering an innovative solution that effectively provides a flexible outsourced data center for web services. The Elastic Compute Cloud (EC2) offers access to scalable computing power, with a “pay as you go” approach that allows users to increase and decrease their usage without penalty. This new concept eliminates upfront costs and offers incredible flexibility, but also has some limitations.

These limitations are significant. The most critical issues for EC2 are dynamic IP addressing coupled with a lack of a 24/7 service level agreement. Amazon’s current service agreement does not promise 24/7 operation of an AMI, so when problems bring down your web site’s AMI, you’ll also lose your web site due to the loss of a valid IP address for the “A” domain name record in your DNS service. Not only will you have to restart your AMI and all web services, but you must also repopulate the global DNS tables with the new “A” record.
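The stale “A” record problem described above can be detected with a simple check. The following is a minimal sketch, not part of WeoCEO or of any AWS tooling: the function names are hypothetical, the addresses come from the reserved documentation range, and how you learn the restarted instance's new IP address is left as an assumption.

```python
import socket


def resolve_a_records(hostname):
    """Return the set of IPv4 addresses that DNS currently reports for hostname."""
    _, _, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def dns_is_stale(current_ip, resolved_ips):
    """True when the instance's current IP is missing from the published A records."""
    return current_ip not in resolved_ips


# Example with literal values (documentation-range addresses, no network needed):
# the site's A record still points at the old instance, so an update is required.
published = {"192.0.2.10"}
print(dns_is_stale("192.0.2.25", published))
```

In practice you would pass the result of `resolve_a_records("your-domain.example")` as the second argument and, if the check returns true, push the new address to your DNS provider by whatever means it supports.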

WeoCEO is a proprietary application originally developed by WeoGeo to manage the use of EC2 in serving WeoGeo clients. This solution has already brought affordable scalability to WeoGeo and its own Web 2.0 applications, and is now being offered as a private beta product for developers of AWS applications, in order to enable others to tap into the true power of AWS.

The WeoCEO application, working within the EC2 environment, eliminates the above-mentioned limitations, and maximizes the power of EC2 by providing automatic and instantaneous scaling, load balancing, and fail-safe support, including a stable IP environment. These critical functions optimize usage and provide true 24/7 operations to make EC2 a powerful, intelligent solution for businesses of any size.

The EC2 model allows scalable capacity to accommodate anticipated changes in traffic levels and growth, but management of this is labor intensive and requires personnel to oversee traffic levels. WeoCEO is an intelligent manager program that fully automates those tasks, and provides efficient usage and appropriate capabilities to handle growth, cyclical needs, and sudden spikes in demand. A sudden influx of traffic generated by something like a TechCrunch or Digg article can cause catastrophic failures at the exact moment a new site is trying to capture users. With WeoCEO, the increased demand is automatically addressed with increased capacity to handle the load without delay. And when demand decreases, WeoCEO removes the excess capacity to eliminate unnecessary costs.
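To give a feel for what demand-driven scaling involves, here is a generic sketch of a capacity rule. WeoCEO is proprietary and its actual logic is not described here, so this should be read as one simple possibility under stated assumptions: the function names, the per-instance capacity, and the minimum and maximum limits are all hypothetical.

```python
import math


def desired_instances(requests_per_sec, capacity_per_instance,
                      min_instances=1, max_instances=20):
    """Number of instances needed to serve the load, clamped to the allowed range."""
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


def scaling_action(running, requests_per_sec, capacity_per_instance):
    """Positive result: instances to launch. Negative result: instances to shut down."""
    return desired_instances(requests_per_sec, capacity_per_instance) - running


# A traffic spike arrives while two instances are running (assumed capacity:
# 100 requests per second per instance), then demand falls away again.
print(scaling_action(running=2, requests_per_sec=950, capacity_per_instance=100))
print(scaling_action(running=10, requests_per_sec=150, capacity_per_instance=100))
```

A real manager would also need to smooth out short-lived fluctuations so that instances are not started and stopped on every small change in traffic.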

WeoCEO also provides the critical fail-safe support in case of failure that will ensure true 24/7 operational capability. When problems cause a temporary loss of your website, WeoCEO’s automated system will retrieve a duplicate image, and have your site available again within moments. With redundant systems that automatically regenerate and provide a stable IP address environment, your site’s functionality is maximized.

Background, Amazon, WeoGeo

Private Beta Launch

We made it into private beta yesterday. The infrastructure development on Amazon Web Services (AWS) took a little more time than anticipated, but more on that later. The home page at www.weogeo.com should give an indication of what we are trying to accomplish. This page will change shortly, as we move through the private beta period. What can be seen is that we are setting up a place for buyers and sellers of mapping products to exchange their maps (called WeoGeo Exchange), as well as a place for them to talk about their maps, to discuss new developments, and to define definitions and concepts that are important to the geospatial community (called WeoGeo Community).

We are going to shake the system down for a couple of weeks. The technical hurdles that we cleared (or hope to have cleared) in the development of this site have been large. Providing the ability to exchange terabytes of maps in an easy, accessible manner, which includes map searching, sorting, digital rights management, financial transactions, and map delivery, all on top of Amazon’s EC2, which is still in beta itself, has been a challenge. We think we have done it, and are now ready to start showing it.

If you are interested in seeing and discussing what we are doing, please register. We are passionate about this field and the technology, and look forward to working with all those who share our passion.