Posts Tagged ‘WeoCEO’

WeoCEO Emerging From Private Beta

Thursday, October 18th, 2007

WeoGeo has created a scalable, fault-tolerant infrastructure to manage its Amazon Web Services Elastic Compute Cloud (EC2) operations. I’ve written about it a couple of times (see this link for a listing of the Amazon tagged blogs). The latest version of WeoCEO (Version 0.1.0) is ready for release, and with it we are moving from private to public beta.

This version includes the Assistant to back up WeoCEO (see this feature described in this Amazon Web Services StartUp Event Slide Show). WeoCEO Version 0.1.0 also provides enhancements to the stable IP addressing, failure detection, and automatic scaling and load balancing. These enhancements include automatic emailing to your site administrator during trouble events and detailed logging capabilities.

WeoCEO Version 0.1.0 (including the load-balancing and auto-scaling capabilities) will be free of charge at least until December 1, 2007. It will continue to be free if you only use the stable IP addressing and auto-recovery features for a single client instance.

There will be a charge for the load-balancing and auto-scaling features of WeoCEO, which support running multiple EC2 instances and optimizing your network. The charge for these features will be $0.05 per managed client instance per hour. The charge will be based on the average usage over each hour, sampled at intervals of less than 15 minutes.
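To make the pricing concrete, here is a minimal sketch of how an hourly charge could be worked out from sampled instance counts. Only the $0.05 rate and the idea of averaging over the hour come from the description above; the function name, the sample values, and the 10-minute sampling interval are assumptions for illustration, not the actual billing code.

```python
from statistics import mean

RATE_PER_INSTANCE_HOUR = 0.05  # USD per managed client instance per hour (from the post)


def hourly_charge(instance_counts, rate=RATE_PER_INSTANCE_HOUR):
    """Charge for one hour, given instance counts sampled during that hour.

    The sampling scheme is an assumption; the post only says samples are
    taken at intervals of less than 15 minutes.
    """
    if not instance_counts:
        return 0.0
    return mean(instance_counts) * rate


# Hypothetical hour sampled every 10 minutes: traffic rises, then falls.
samples = [2, 2, 4, 4, 3, 3]
print(f"Average instances: {mean(samples)}")
print(f"Charge for the hour: ${hourly_charge(samples):.2f}")
```

Averaging over the hour means a short spike to four instances costs less than running four instances for the full hour would.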

You can obtain a WeoCEO ISO with the setup and installation instructions by visiting http://www.WeoCEO.com and clicking the “Signup” button, or by clicking the Signup button below. We are still in beta, so constructive comments on any of the components that make up this service will be met with exuberance and free goodies.

Amazon Web Services EC2 Outage

Monday, October 1st, 2007

This weekend was a bit crazy for some of the AWS EC2 users. EC2’s “management software erroneously terminate[d] a small number of user’s instances” (from the AWS forum post). Some of our instances were among them, providing an opportunity to test the fail-safe mechanisms in WeoCEO. We received the following email:

From: Amazon Web Services
Sent: Saturday, September 29, 2007 5:46 PM
To: David Kohler
Subject: Amazon EC2 Notification of Terminated Instances

Hello,

This is just a quick note to let you know that some of your instances were erroneously terminated today. We have resolved the underlying issue, and the service is fully available.

You can find a summary of the issue here:

http://developer.amazonwebservices.com/connect/thread.jspa?messageID=6816

These are your affected instances:
i-8004e0e9
i-681ef101

We apologize for this inconvenience.

Sincerely,

The Amazon EC2 Team

Please be aware of the limitations of utility computing, as well as the promise. Planning for these outages will be a requirement for safely outsourcing your metal resources.

If we had not prepared for this by building WeoCEO, this could have been a real issue for us. We would have needed to scramble staff at 6 AM on a Saturday morning. Fortunately, WeoCEO recovered from the failure, and it was not until Monday afternoon that we noticed that it had happened to a lot of other people.

From the forum post by WeoCEO’s architect, Bob Banfield:

Here is a quick shot from our WeoCEO logs. We told WeoCEO that regardless of usage we want a minimum of two instances running, so that is the initial number of instances at 6am, even though we are receiving next to no traffic. At 6:09, i-681ef101 stops responding (the first of five allowed consecutive failures). At 6:10 it still hasn’t responded, and at 6:11 both it and instance i-52907e3b have now stopped responding. Instance i-52907e3b comes back up in another 2 minutes, but instance i-681ef101 is ruled dead after 5 failures. It is automatically terminated and a new one is brought up in its place.

(SSS) Sat Sep 29 06:07:24 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 2
(SSS) Sat Sep 29 06:08:25 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 2
(EEE) Sat Sep 29 06:09:25 2007 Weoceo[6562]: Instance i-681ef101 has not reported statistics (1/5)
(SSS) Sat Sep 29 06:09:25 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 2
(EEE) Sat Sep 29 06:10:25 2007 Weoceo[6562]: Instance i-681ef101 has not reported statistics (2/5)
(SSS) Sat Sep 29 06:10:25 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 2
(EEE) Sat Sep 29 06:11:26 2007 Weoceo[6562]: Instance i-681ef101 has not reported statistics (3/5)
(EEE) Sat Sep 29 06:11:26 2007 Weoceo[6562]: Instance i-52907e3b has not reported statistics (1/5)
(EEE) Sat Sep 29 06:11:26 2007 Weoceo[6562]: No instances have reported statistics.
(EEE) Sat Sep 29 06:12:26 2007 Weoceo[6562]: Instance i-681ef101 has not reported statistics (4/5)
(EEE) Sat Sep 29 06:12:26 2007 Weoceo[6562]: Instance i-52907e3b has not reported statistics (2/5)
(EEE) Sat Sep 29 06:12:26 2007 Weoceo[6562]: No instances have reported statistics.
(EEE) Sat Sep 29 06:13:26 2007 Weoceo[6562]: Instance i-681ef101 has not reported statistics (5/5)
(EEE) Sat Sep 29 06:13:26 2007 Weoceo[11310]: Terminating i-681ef101 due to lack of statistics
(SSS) Sat Sep 29 06:13:26 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 1
(III) Sat Sep 29 06:13:26 2007 Weoceo[6562]: Launching 1 instance(s)
(III) Sat Sep 29 06:13:26 2007 Weoceo[11310]: Terminating 1 instance
(SSS) Sat Sep 29 06:14:28 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 1
(SSS) Sat Sep 29 06:15:28 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 1
(SSS) Sat Sep 29 06:16:29 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 1
(SSS) Sat Sep 29 06:17:29 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 1
(SSS) Sat Sep 29 06:18:29 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 1
(III) Sat Sep 29 06:19:05 2007 Weoceo[11351]: Added ID=i-94ce20fd, PublicHost=ec2-67-202-13-222.z-1.compute-1.amazonaws.com, Host=domU-12-31-36-00-1D-B4.z-1.compute-1.internal, PublicIP=67.202.13.222, IP=10.253.34.66
(SSS) Sat Sep 29 06:19:32 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 2
(SSS) Sat Sep 29 06:20:32 2007 Weoceo[6562]: Overall usage = 0% NumInstances = 2

Email warnings were delivered to me at 6am on Saturday alerting me to the problem; however, I was fast asleep, and WeoCEO correctly identified and corrected the problem.
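The behavior in the log can be sketched in a few lines: count consecutive missed reports per instance, terminate an instance at the fifth miss, and launch replacements until the minimum is met. This is an illustrative reconstruction from the log excerpt, not WeoCEO’s actual code; the `Monitor` class, `fake_launch`, and the replacement ID are hypothetical, and the real system took several minutes to bring the new instance online rather than adding it immediately.

```python
import itertools
from collections import defaultdict

MAX_FAILURES = 5   # consecutive missed reports allowed, as in the log (x/5)
MIN_INSTANCES = 2  # minimum pool size described in the forum post


class Monitor:
    """Tracks missed statistics reports and replaces dead instances."""

    def __init__(self, instances, launch):
        self.instances = set(instances)
        self.failures = defaultdict(int)
        self.launch = launch  # hypothetical callback returning a new instance ID

    def poll(self, reporting):
        """Process one polling cycle; `reporting` holds IDs that sent statistics."""
        terminated = []
        for iid in sorted(self.instances):
            if iid in reporting:
                self.failures[iid] = 0  # any report resets the counter
            else:
                self.failures[iid] += 1
                if self.failures[iid] >= MAX_FAILURES:
                    terminated.append(iid)
        for iid in terminated:
            self.instances.discard(iid)
            del self.failures[iid]
        launched = []
        while len(self.instances) < MIN_INSTANCES:
            new_id = self.launch()
            self.instances.add(new_id)
            launched.append(new_id)
        return terminated, launched


_ids = itertools.count(1)


def fake_launch():
    # Stand-in for a real EC2 launch call; returns a made-up instance ID.
    return f"i-new-{next(_ids)}"


# Replay of the polling cycles from 06:08 to 06:13 in the log above.
cycles = [
    {"i-681ef101", "i-52907e3b"},  # 06:08 both report
    {"i-52907e3b"},                # 06:09 i-681ef101 misses (1/5)
    {"i-52907e3b"},                # 06:10 (2/5)
    set(),                         # 06:11 both miss
    set(),                         # 06:12 both miss
    {"i-52907e3b"},                # 06:13 i-52907e3b is back, i-681ef101 hits 5/5
]
monitor = Monitor({"i-681ef101", "i-52907e3b"}, fake_launch)
for reporting in cycles:
    terminated, launched = monitor.poll(reporting)
    if terminated or launched:
        print("terminated:", terminated, "launched:", launched)
```

Requiring consecutive failures is what keeps i-52907e3b alive here: its two missed reports are forgotten as soon as it reports again.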

We believe in the future of scalable utility computing. Dealing with events such as these is just one of the issues with these types of systems that we’ll all have to overcome to make this future work. Our goal is to share what we are creating for WeoGeo in a way that helps others overcome such problems.

I do not wish to minimize the impact of this AWS outage, but it would be unrealistic to assume that this type of event will not happen in the future. We should all consider this in building our virtual computing architectures. The use of AWS means that you are outsourcing your metal infrastructure. This means that your system design must be organic and self-healing (see also slideshare link).

Our solution is simple to use and operate, but does expect that you have some working knowledge of EC2. There are others who can help in building these types of architectures on AWS from the ground up (some of those contributed to the above AWS Forum thread including Thorsten at RightScale and Reuven at Enomaly).

WeoCEO was built to help us at WeoGeo survive these types of outages. We are completing our private beta shortly and will release the latest version of WeoCEO into open beta. Contact us at WeoCEO [at] WeoGeo [dot] com if you would like to participate. Open beta will provide the stable IP addressing and recovery options for one instance for free.

Amazon Web Services StartUp – Boston Presentation

Monday, October 1st, 2007

I was out of town last week. I’ll try to catch up on a number of subjects this week.

One of the reasons I was out of town was that I was invited by AWS to present at their StartUp event in Boston.


A copy of the presentation may be seen on Slideshare.net (or just click on the image embedded above). It was a great event, and I enjoyed sharing the stage with the talented people from AideRSS, Praxeon, and Geezeo. It was good to interact with others who are building (and bootstrapping) new web services using AWS.

I truly believe that utility computing is going to change the way businesses get started and (eventually) operate. However, we are going to have to build systems that are organic in how they handle resources, i.e. scale up and down as a function of load. In addition, these systems need to be self-healing by automatically addressing processor and storage outages.

The importance of self-healing will be evident in the next post.

Image Processing and Delivery Using Virtual Computing on EC2

Thursday, September 6th, 2007

I posted last week about bandwidth issues associated with geospatial data and our AWS S3 solution. The deciding factor for us to use Amazon’s offerings was not necessarily the edge distribution capabilities of S3, but the synergy from combining S3 data storage and distribution with virtual computing capabilities of EC2. There are multiple issues in image processing that require a ton of memory space and CPU horsepower. In both Market and Server, we offer the following basic map distribution options to our map providers -

Geo Clipping (6 zoom levels, allowing for ~125 million possible selections per data set)
Spatial Resampling (4 levels)
Layer Resampling (depends on data)
Output File types (5 – JPEG, GeoTIFF, ENVI, ESRI BIL, ERDAS IMG)
Projections (5 – UTM, Transverse Mercator, Lambert Conic, Albers Equal, Geographic)
Datums (3 – WGS84, NAD 83, NAD 27)

These options result in millions of possible map variants, which preclude the storage of each variant for distribution. So processing power for conversion is critical; and this processing power needs to be connected to a large, web-addressable, temporary data storage array to house the unique variant that a map user has selected. Now for a true mapping marketplace, this infrastructure needs to support 100s to possibly 1000s of simultaneous map requests from the same base map like the 40 GB image in Figure 1. Doing our NeoMapping Market correctly requires the creation of enormous processing, storage, and bandwidth infrastructure.
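As a rough back-of-the-envelope check on the option counts listed above, the numbers can simply be multiplied out. The sketch assumes every option combines freely with every other, which is my simplification (some projection and datum pairings may not be offered), and it leaves out layer resampling because that depends on the data set. Treat the result as an upper bound, not a measured figure.

```python
from math import prod

# Counts taken from the option list above; layer resampling is omitted
# because the post says it depends on the data.
clip_selections = 125_000_000  # ~125 million geo clipping selections per data set
options = {
    "spatial_resampling": 4,
    "output_file_types": 5,
    "projections": 5,
    "datums": 3,
}

# Assumption: all options are independent and can be combined freely.
combos_per_clip = prod(options.values())
upper_bound = clip_selections * combos_per_clip

print(f"Option combinations per clip selection: {combos_per_clip}")
print(f"Upper bound on variants per data set: {upper_bound:,}")
```

Even the clip selections alone run into the millions per data set, so storing every variant in advance is out of the question.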

Figure 1. 40 GB, 156 layer HyperSpectral Imagery (HSI) map listed on WeoGeo Market. (Click on image to go to the listing in the Market).

However, who could afford that infrastructure upfront? Our original estimates for acquiring base computation needs and placing them into a co-location facility were around $500K. While not a lot of money in the scale of today’s internet operations, it was big for us. In addition, we were trying to develop the software architecture to support the Market and Server, and these expenses were large in and of themselves. AWS provided a unique and simultaneous answer to many of our immediate storage, processing, and distribution needs.

Developing our infrastructure on the scalable AWS solution allows us to say we can support the 1000s of map requests required for a functioning digital marketplace. The user experience is vital to the service’s credibility and therefore our success. However, there is a true (and in a number of cases unexpectedly high) cost in this decision. We traded high capital expenditures for high operating expenditures. In an upcoming post, I’ll talk about the Total Cost of Operations (TCO) on AWS, and some of the ways we are moving to reduce these high operating expenses through stability and scaling solutions. Some of these solutions we have turned into products that we provide to others (e.g., WeoCEO).

I would be interested in hearing about the actual experience of others on AWS and whether S3 and EC2 could or could not meet their needs.

Commodity Computing Cycles (C3) and ETech

Saturday, March 24th, 2007

I was preparing for ETech and ran across Jeff Barr’s recent AWS blog.  He points to a number of interesting links, including WeoCEO’s new website (thanks Jeff!). 

One of the links he points to is David Berlind’s video on “Is it time to throw away your servers?”.  It was a highly entertaining video, but more importantly it clearly laid out the business case for why cluster and grid computing is going to revolutionize this business.  We must be channeling the same psychic hotline, because it mirrors the case I laid out in the Cycles in the Sky blog earlier this week.  (However, David’s is far more entertaining, with real numbers.)

Commodity Computing Cycles (C3) is a paradigm shift in business computing.  It is coming, and to be honest, I have no way to predict the impact of the change on efficiency and productivity in the business computer arena.  I do know that in order for it to achieve its potential, those of us focusing on cluster and grid computing have to deliver some sort of Service Level Agreement (SLA).  While David points to the cost advantages, what he did not point out is the lack of an SLA from Amazon.  Someone running an ecommerce site may willingly pay the additional money shown in David’s video for a traditional data center operation, if they can be assured of up-time and bandwidth.  Without these assurances, the dollar savings obtained by using a C3 solution may be given back in poor user experience or web client customer service.

That being said, I think that we (the greater community of Amazon Web Services and EC2 users) are working towards achieving reasonable service levels upon which we can build ecommerce solutions.  We developed our WeoCEO ISO because it was required in order to host our WeoGeo geospatial exchange on EC2.  There are other service issues, such as large file ingestion (imagine trying to push a terabyte-size file up to S3!), but we are confident that these too can be overcome and solutions delivered to the community.  I truly believe that the revolution is here, and like any other paradigm shift, there will be a tremendous opportunity for those willing to place their stakes in the ground to deliver solutions to those who follow.
 

On other notes, I will be taking my soapbox to ETech next week.  Find me if you would like to chat about such things as revolutions and paradigm shifts in cluster and grid computing, as well as geospatial technologies.

WeoCEO – How to Use the True Power of Amazon Web Services

Friday, March 9th, 2007

As mentioned earlier, Amazon Web Services (AWS) is offering an innovative solution that effectively provides a flexible outsourced data center for web services. The Elastic Compute Cloud (EC2) offers access to scalable computing power, with a “pay as you go” approach that allows users to increase and decrease their usage without penalty. This new concept eliminates upfront costs and offers incredible flexibility, but also has some limitations.

These limitations are significant. The most critical issues for EC2 are dynamic IP addressing coupled with a lack of a 24/7 service level agreement. Amazon’s current service agreement does not promise 24/7 operation of an AMI, so when problems bring down your web site’s AMI, you’ll also lose your web site due to the loss of a valid IP address for the “A” domain name record in your DNS service. Not only will you have to restart your AMI and all web services, but you must also repopulate the global DNS tables with the new “A” record.

WeoCEO is a proprietary application originally developed by WeoGeo to manage the use of EC2 in serving WeoGeo clients. This solution has already brought affordable scalability to WeoGeo and its own Web 2.0 applications, and is now being offered as a private beta product for developers of AWS applications, in order to enable others to tap into the true power of AWS.

The WeoCEO application, working within the EC2 environment, eliminates the above-mentioned limitations and maximizes the power of EC2 by providing automatic and instantaneous scaling, load balancing, and fail-safe support, including a stable IP environment. These critical functions optimize usage and provide true 24/7 operations to make EC2 a powerful, intelligent solution for businesses of any size.

The EC2 model allows scalable capacity to accommodate anticipated changes in traffic levels and growth, but managing this is labor intensive and requires personnel to oversee traffic levels. WeoCEO is an intelligent manager program that fully automates those tasks, and provides efficient usage and appropriate capacity to handle growth, cyclical needs, and sudden spikes in demand. A sudden influx of traffic generated by something like a TechCrunch or Digg article can cause catastrophic failures at the exact moment a new site is trying to capture users. With WeoCEO, the increased demand is automatically addressed with increased capacity to handle the load without delay. And when demand decreases, WeoCEO removes the excess capacity to avoid excess costs.
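For readers wondering what “scale up and down as a function of load” might look like, here is a minimal sketch of one common approach: pick the instance count that would bring average usage back toward a target. The function name, the 60% target, and the minimum and maximum limits are all assumptions for illustration; the post does not describe the rule WeoCEO actually uses.

```python
import math


def desired_instances(current, usage_pct, target_pct=60.0,
                      min_instances=2, max_instances=20):
    """Return an instance count that would bring average usage near target_pct.

    All thresholds here are hypothetical defaults, not WeoCEO settings.
    """
    if current <= 0:
        return min_instances
    # Total load is roughly current * usage; spread it so each instance sits at the target.
    needed = math.ceil(current * usage_pct / target_pct)
    return max(min_instances, min(max_instances, needed))


# Quiet period: stay at the configured minimum.
print(desired_instances(current=2, usage_pct=0))
# Traffic spike from a popular article: add capacity.
print(desired_instances(current=2, usage_pct=90))
# Demand falls off: shrink back, but never below the minimum.
print(desired_instances(current=4, usage_pct=15))
```

A real manager would also want some damping (for example, waiting a few polling cycles before shrinking) so that short dips do not cause instances to be terminated and relaunched repeatedly.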

WeoCEO also provides the critical fail-safe support in case of failure that will ensure true 24/7 operational capability. When problems cause a temporary loss of your website, WeoCEO’s automated system will retrieve a duplicate image, and have your site available again within moments. With redundant systems that automatically regenerate and provide a stable IP address environment, your site’s functionality is maximized.