Archive for the 'FERI' Category

Amazon, WeoGeo, geospatial, FERI, MTurk

Some Thoughts on Mechanical Turk and Geo-Processing

We use Amazon Web Services (AWS) quite a bit. Mostly we use EC2 and S3, but recently we have been using a limited bit of Mechanical Turk (MTurk) for some testing of the web site.

For those of you who don’t know what MTurk is, from the web site -

…The Mechanical Turk web service enables companies to programmatically access this marketplace and a diverse, on-demand workforce. Developers can leverage this service to build human intelligence directly into their applications.

Our use has been limited to testing of the web site. However, there have been some image processing uses of MTurk, including the search and rescue (SAR) efforts to find Jim Gray and Steve Fossett.

I wear two hats these days. We are still actively involved in the development of HyperSpectral Imaging (HSI) sensors and algorithms (see the Florida Environmental Research Institute). It was from these efforts that we developed the cataloging, discovery, and distribution systems that we spun out into WeoGeo.

The holy grail of imaging techniques is the automatic extraction of features and classification of materials within the raster data. It is something we have been trying to develop for over a decade. There are others who have been working at it longer.

After all these years, there are some problems that are still difficult to solve in processing imagery. They frequently require just looking at the images frame by frame to resolve features and classify stuff that just defies algorithmic development. It strikes me that there may be some parts of this processing that may not be easily solved using computer algorithms. Things like finding seam lines in overlapping aerial photographs.

Several major imaging vendors send a chunk of their current image processing to low cost countries like China and India to complete their large-scale projects. It seems that there might be a better way to accomplish such geo-processing tasks that still require eyes than to incur the time and expense of sending these tasks overseas. Perhaps MTurk and some smart programming might offer a different approach.

I also wonder what other sorts of QC/QA tasks in geo-processing might be solved by MTurk. I might try to kick it around a bit at GeoWeb. Find me if you have some thoughts.

(Ford assembly line, 1913)

Background, WeoGeo, geospatial, FERI, mapping, WeoGeo Server

Follow-up to Directions Magazine’s Podcast on WeoGeo

Adena Schutzberg did a podcast with me last week on the business model for WeoGeo. It was my first podcast and I hope that I made sense to people (I welcome comments and/or critiques in the comments section here). I would like to thank Adena for giving us the opportunity to tell our story.

However, I am not sure I was as clear as I could have been about our history and the importance of that history in the development of WeoGeo. I could not quite put my finger on what was missing until after the AWS StartUp Event - Boston (see here as well for my comments) when someone asked how many man-years of effort went into developing the site.

My first response was to take the number of years that FERI was in operation times the number of people involved at FERI. Kind of silly, I know. But when I think of why we built WeoGeo, this response seems relevant. Their response, of course, was, “no really, how much technical development time?” I understood the question; the person was trying to ascertain how difficult it would be to recreate what we are doing.

Our technical development on this project did start back around 2001 with a project called Hyperspectral Data Repository On-line (HyDRO). This was our first distribution system, developed to help alleviate the problems associated with delivering HSI data to our customers. This concept and technology eventually evolved into the WeoGeo Server (see post here as well). Between 2001 and 2005 we had 4 PhDs and masters-trained personnel spending a portion of their time on HyDRO because it was a critical element of our research programs. In the last couple of years, we increased the number of people working on WeoGeo Market/Server to more than 12 currently, if you include outside contractors. For the most part, they are highly trained GIS and MIS/CIS/CS personnel.

The technology is hot, no question about it. I am amazed on a daily basis at what our group of people has developed for mapping on both commodity computers and utility computing systems. Yet, here is the rub to this type of man-years calculation. I really believe that the reasons for WeoGeo, and its associated development time, stem from our history at FERI, which makes such calculations difficult. The “technical development time” is not just time spent coding; it includes the needs assessment and the development of the system architecture to address critical problems and/or pain. What we have developed at WeoGeo is a direct function of two critical needs of our operations as a research and imagery services organization.

These two critical needs were (and still are):
1) Delivery of our survey grade, high volume mapping content;
2) Finding and acquiring other survey grade mapping content to fuse with ours to create value-added geocontent for our clients.

WeoGeo was built to solve these two critical problems (there are others, but not nearly as critical to our organization as these). If you have never been faced with these problems, then you might not appreciate the depth of the solutions we have built to service these needs (and its potential). But if you have, then you have felt our pain - and I hope value our solution.

Remote Sensing, Hyperspectral, FERI, mapping

Image Fusion and Sharpening with Multi and Hyperspectral Data

The panchromatic limitations of WorldView-1, recently launched by DigitalGlobe, have brought a few posts (e.g. free geography tools and the confused life) on the fusion of high spatial resolution panchromatic imagery (PAN) with lower spatial resolution multispectral imagery (MSI). I thought I would briefly comment on image fusion because over the years it has become easier to accomplish, but the results or limitations of the fused product may be difficult to understand.

There are many ways to accomplish pan-sharpening, including band substitution, color space transformation and substitution, and Principal Component Substitution (Jensen, 2005). As mentioned on the confused life, temporal decorrelations introduce artifacts into a fused, or PAN-sharpened, image. However, there are other artifacts that can be equally important if one is trying to create a quantitative product for classification mapping or target detection.

The inherent difficulty with all of the PAN sharpening methods is that they are fundamentally based on the technical and environmental conditions under which the PAN imagery was collected. Since it is difficult, if not impossible, to accurately correct for illumination and atmospheric conditions in PAN-sharpened imagery (subject for a much longer post), the PAN-sharpened images may be limited to classification and detection within a scene. Inter-scene comparisons (i.e. change detection between scenes or cross-scene classifications) using spectral properties require the aforementioned corrections. In addition, when the instantaneous field of view (IFOV) of the PAN and/or MSI sensors is too large, spectral and illumination changes will be present at the edges of the image, making even within-scene classifications difficult. Because of these issues, PAN-sharpened multispectral images are frequently used to identify features based on relative color differences within an image, rather than for target identification or environmental characterization based on a spectral signature itself.
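To make the substitution idea concrete, here is a minimal sketch in Python. It is not the method behind any product discussed in this post. It assumes a three-band MSI pixel that has already been resampled to the PAN grid, takes "intensity" to be the plain band mean, and uses the additive shortcut for swapping that intensity with the PAN value. The function name `pan_sharpen_pixel` is made up for illustration.

```python
def pan_sharpen_pixel(msi, pan):
    """Swap the intensity of one MSI pixel for the PAN value.

    msi: sequence of band values (e.g. R, G, B), already resampled to the PAN grid.
    pan: PAN value for the same ground location, on the same radiometric scale.
    """
    intensity = sum(msi) / len(msi)   # crude intensity: mean of the bands (assumption)
    detail = pan - intensity          # spatial detail carried only by the PAN image
    return [band + detail for band in msi]


sharpened = pan_sharpen_pixel([0.2, 0.4, 0.6], pan=0.5)
```

In this sketch the differences between bands are left untouched while the overall brightness is taken entirely from the PAN value, so whatever illumination and atmosphere were baked into the PAN collection carry straight into the fused pixel.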

Figure 1. The fusion of high spatial resolution MSI (left figure) with lower spatial resolution HSI (middle figure) into a high spatial resolution, high spectral resolution image (right image). The bottom row of images represents the spectral plots at the pixel located at the center of the red cross hairs in the images directly above them.

We have done some work in this area, mainly focused on sharpening hyperspectral imagery (HSI) with multispectral imagery (MSI). Figure 1 shows the results of some of our efforts. The left image is a high resolution MSI from an Applanix DSS. Underneath it are the digital values of the RGB channels of the image. The middle image is the lower spatial resolution HSI, and underneath it is the full-resolution spectrum of the HSI vector (~3 nm resolution). By fusing these two images together (right image), we were able to create a high spatial resolution sharpened HSI image whose spectral vector matched reasonably well with the spectral vector from the original HSI image. The use of atmospheric- and illumination-corrected HSI imagery means that we could make classification comparisons or target detections using these spectra much more robustly across scenes in time and space.

When making fused, or derivative, mapping products, the value of the map is critically determined by the base mapping material and the skill of the map producer. Understanding the limitations of the base mapping material as well as the fusion techniques themselves is a critical determinant in the value of a derivative mapping product.

References
Jensen, John R., Introductory Digital Image Processing: A Remote Sensing Perspective. Prentice-Hall, Englewood Cliffs, NJ, 2005, 526 pp.

Storage, Background, Amazon, FERI, mapping, WeoGeo Server

How do you deliver 100 40GB imagery files?

This is a bit tougher than the solution discussed in this earlier post. When we (FERI) first started developing HSI sensors and flying them for others, the distribution of imagery data was mainly through DVDs. As the research groups got larger, we started getting more and more requests for data. This eventually led to the WeoGeo Server solution, which allows for customization and asynchronous delivery.

However, 100 40GB files that look like Figure 2 in my HSI post means 4 TB of data through our lab’s pipe in a relatively short period of time. Our bandwidth at the time we were trying to develop these solutions was a dedicated T1, or 1.5 Mbit per second. To transfer 4 TB of imagery files with full use of our pipe would require 259 days.
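For anyone who wants to check the arithmetic, here is a small sketch in Python. It assumes the link is fully dedicated to the transfer and ignores protocol overhead; the function name is mine. The 259-day figure appears to correspond to binary units (4 TiB over 1.5 Mibit/s), while strict decimal units give a slightly smaller number.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes, link_bits_per_second):
    """Days needed to move size_bytes over a fully dedicated link."""
    return size_bytes * 8 / link_bits_per_second / SECONDS_PER_DAY


# Binary units: 4 TiB over 1.5 Mibit/s
binary_days = transfer_days(4 * 2**40, 1.5 * 2**20)

# Decimal units: 4 TB over 1.5 Mbit/s
decimal_days = transfer_days(4 * 10**12, 1.5 * 10**6)

print(round(binary_days), round(decimal_days))  # prints: 259 247
```

Either way, the answer is most of a year for a single batch of files.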

Clearly there are some solutions these days that would have helped this type of large file distribution effort. Akamai, Limelight Networks, or some bittorrent solution would provide capabilities to deliver large files over distributed networks. However, we were also providing search and customization solutions, which required modification of the data before delivery. This meant that we had a scalability problem in processing as well as delivery. Edge distribution solutions would solve one part of our problem, but not necessarily the processing part.

We began to explore co-location solutions, but these seemed to require a lot of upfront costs, as well as travel and maintenance expenses. As a small business, those capital expenditures were more than we could absorb. It was at this point that we were introduced to Amazon Web Services by a former co-worker who had been recruited by Amazon. AWS allowed us to build a distribution system for large data files on top of a very large pipe via S3. (I’ll discuss the processing using EC2 later.) It provided us scalable distribution at reasonable cost for those 100 40GB files.

To be honest, there are some devils in the details in using S3 for our operations. But (to date), the service has been more valuable than costly. The rapid ingestion of large files into S3 is a current problem that we are trying to solve. Moving forward we hope to build on the expansion of S3 as Amazon develops more physical data storage locations. This will provide us with some of the edge distribution advantages of the above solutions, while keeping us connected with our virtual computing solutions on EC2.

I’m also curious to see how others are using S3 in geospatial solutions; if you have a unique one, please let me know.

Background, Hyperspectral, Amazon, WeoGeo, grid computing, FERI, WeoGeo Server

40 GB Imagery File Redux

An obvious question that drops out of yesterday’s post on the right file format to use to distribute large raster files is, “How do you distribute a 40 GB file?” The distribution of a single 40 GB file would overwhelm the bandwidth of many small businesses. That was one of the reasons we originally developed the WeoGeo Server.

Figure 1. WeoGeo Server (click on the image to see more information)

The Server allows the mapping organization to distribute customer-defined customized products that would reduce the required file size, and thus bandwidth, to satisfy their customers’ demand. However, there is still the use case where the customer wants the whole file.

Since FERI is a small business, we couldn’t have our daily research activities impacted by an imagery request. So the first (obvious) step was to develop a customization and distribution system that processes a data request in an asynchronous manner, i.e. the order is taken during business hours, but it is processed and delivered after business hours. This allowed us to optimize our bandwidth in our labs and still reasonably satisfy customer demands (assuming they did not need instantaneous data delivery). We also tweaked the system to allow some small files and all of our own requests to be processed immediately, while larger ones for external users were processed in the evenings.
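The routing rule described above can be sketched in a few lines of Python. This is an illustration of the policy, not the WeoGeo Server code; the business hours, the "small file" cutoff, and the function name `process_now` are all assumptions made for the example.

```python
from datetime import time

BUSINESS_OPEN = time(8, 0)       # assumed lab hours, not from the post
BUSINESS_CLOSE = time(18, 0)
SMALL_FILE_BYTES = 100 * 10**6   # assumed cutoff for a "small" request


def process_now(size_bytes, internal, placed_at):
    """Return True to process an order immediately, False to queue it for the evening."""
    after_hours = not (BUSINESS_OPEN <= placed_at < BUSINESS_CLOSE)
    return internal or size_bytes <= SMALL_FILE_BYTES or after_hours


# A 40 GB external order placed mid-morning waits for the evening run.
deferred = not process_now(40 * 10**9, internal=False, placed_at=time(10, 30))
```

The order itself is always accepted right away; only the heavy processing and delivery are pushed outside business hours.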

The asynchronous data delivery is also a fundamental difference between our technology and online GIS servers. We optimized for discovery, customization, and ordering in a way that allows the customer to receive near-instant gratification on the discovery and ordering, while (possibly) delaying gratification on the delivery.

While the customization of product selection and the asynchronous processing and delivery bought us some additional help in terms of distributing large geospatial content files, it still did not help us with the problem of what to do with multiple requests for 40 GB image files. This is where some of my earlier posts, where I described our use of Amazon Web Services, begin to make some sense (and maybe why Jinesh digs what we are doing).

However, I am late for dinner, so I’ll pick up this theme on a later post…

Background, Remote Sensing, Hyperspectral, WeoGeo, FERI, mapping, BigTIFF

What file format do you use for a 40GB image? (BigTIFF!)

Large imagery files are a problem. In the hyperspectral world, we send things via ENVI’s file format (BSQ, BIL, or BIP). ENVI was designed by folks doing HSI remote sensing and was optimized to easily handle large raster images. The use of this file format allows us to deliver extremely large raster files, with a separate header that describes all the channels, bands, or layers in the image.
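For readers who have not met the three interleaves, the difference is only in the order the values are written to disk. The Python sketch below computes where a single value lives in a flat raster file for each layout. It assumes zero-based indices and no header bytes embedded in the data file, and the function name is mine, not part of ENVI.

```python
def sample_offset(interleave, line, sample, band, lines, samples, bands, bytes_per_sample):
    """Byte offset of one value in a headerless raster with the given interleave."""
    if interleave == "bsq":      # band sequential: one whole band after another
        index = (band * lines + line) * samples + sample
    elif interleave == "bil":    # band interleaved by line: all bands of a line together
        index = (line * bands + band) * samples + sample
    elif interleave == "bip":    # band interleaved by pixel: full spectrum of a pixel together
        index = (line * samples + sample) * bands + band
    else:
        raise ValueError(f"unknown interleave: {interleave}")
    return index * bytes_per_sample
```

BIP keeps the whole spectrum of a pixel contiguous, which suits per-pixel spectral work, while BSQ keeps each band contiguous, which suits viewing one band at a time.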

Unfortunately, not everyone owns a copy of ENVI. It is an expensive image processing package. While other remote sensing and GIS packages claimed to handle multi-band imagery data, we found that support for imagery with bands n > 3 was difficult at best. So if our customers at FERI didn’t have ENVI, the transport of the imagery had to be accomplished in another file format. The most common format other than the ENVI format for us was GeoTIFF.

Unfortunately, the GeoTIFF format is limited to 4 GB. This is clearly problematic for the image shown here in Figure 1.

Figure 1. HSI imagery of St. Joseph Bay, FL (click on the image to see the data set at WeoGeo Market.)

This image is a 156-band hyperspectral mosaic. The entire image at its native spatial resolution equals 40 GB in size. Cutting this data into 10 tiles of 4 GB apiece would be one way to deliver this data set. But this is problematic for both us and the receiver of the images, as the time, energy, and effort to tile and then re-mosaic is less than efficient.
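The 4 GB ceiling comes from classic TIFF storing file offsets in 32 bits. A quick Python sketch of the tile count follows; it treats 40 GB as 40 GiB and ignores per-file header overhead, so the result is a lower bound rather than a tiling plan. The names are mine.

```python
import math

CLASSIC_TIFF_LIMIT = 2**32  # classic TIFF stores file offsets in 32 bits


def tiles_needed(total_bytes, limit=CLASSIC_TIFF_LIMIT):
    """Minimum number of files if each must fit within the classic TIFF offset limit."""
    return math.ceil(total_bytes / limit)


print(tiles_needed(40 * 2**30))  # prints: 10
```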

You could also say that for the most part that HSI data is a relatively small backwater of the remote sensing community, so why worry about it. To this I would respond with this imagery that we collected at the same time in Figure 2.

Figure 2. 3-Band DSS imagery of St. Joseph Bay, FL (click on the image to see the data set at WeoGeo Market.)

This is a 3-band RGB image from an Applanix DSS. Its pixel size was about 1/6 that of the HSI sensor. The higher spatial resolution makes this image nearly as large as the HSI image. We actually incurred the pain of tiling the full image set for our original customer because they had only ESRI software with which to analyze this image.
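It may seem odd that a 3-band image can rival a 156-band one, but pixel count grows with the square of the improvement in pixel size. A rough Python check, assuming both images cover the same footprint and store the same number of bytes per sample (neither is stated above), with a function name of my own:

```python
def relative_size(bands_a, bands_b, pixel_size_ratio):
    """Size of image A relative to image B over the same footprint.

    pixel_size_ratio is A's ground pixel size divided by B's.
    Assumes both images store the same number of bytes per sample.
    """
    pixels_ratio = (1 / pixel_size_ratio) ** 2   # pixel count scales with 1 / pixel_size^2
    return pixels_ratio * bands_a / bands_b


print(round(relative_size(3, 156, 1 / 6), 2))  # prints: 0.69
```

Six times finer sampling means 36 times the pixels, so 3 bands behave like 108, which is in the same ballpark as 156.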

Our friends at GDAL asked us about sponsoring a new file format, BigTIFF, which would be based on extending the TIFF format. We were happy to step up to help make this happen. I believe that the other sponsors had similar file storage and distribution issues, and we look forward to broad acceptance of this file format.

It will certainly make our distribution issues easier.

Background, Remote Sensing, Hyperspectral, WeoGeo, FERI, mapping

HyperSpectral Imaging (HSI) and the Path to a Digital Marketplace

WeoGeo was born from a need to preview, share, and distribute geospatial content. Our experience with this goes back nearly 9 years in developing a technology called environmental HyperSpectral Imaging (HSI) spectroscopy (see our non-profit research efforts at the Florida Environmental Research Institute). HSI technology is built upon collecting images at many narrow discrete wavelengths to build up a calibrated spectrum for each pixel in the image (Figure 1). Each of these discrete wavelengths is stored as a unique spectral channel yielding dozens, even hundreds, of bands of color information (as opposed to consumer cameras with three bands: Red, Green, and Blue). We created some novel techniques (including WeoGeo) to process, store, and deliver those hundreds of bands efficiently.


Figure 1. HyperSpectral Imaging Concept.

HSI is not a new field. The US government has been actively supporting its development for over 2 decades. The best known aircraft HSI instrument is run by NASA JPL. They have been operating the AVIRIS sensor since the early 1990s for earth sciences studies. Two recent satellite HSI missions include NASA’s Hyperion and ESA’s CHRIS sensors. Our contribution to this field has been focused on dark target spectroscopy for water applications. Our primary patrons in the development of HSI for water have included the Office of Naval Research (ONR) and the National Oceanic and Atmospheric Administration (NOAA). Both agencies have an interest in finding and identifying things in the water using automated targeting and classification techniques. Basically we have been trying to “see” through the water to determine the depth of the water, the bottom habitat, and the water quality (Figure 2).

Figure 2. Imaging through the water. The color of light leaving the water is affected by the depth of the water, the stuff in the water, and stuff on the bottom.

Water is called a “dark target” because the reflectance of light from beneath the water is usually less than 1% (“bright” land targets can be greater than 50%). This is important for signal processing, where the quality of the feature map is strongly dependent on the signal to noise in the imagery, which is directly dependent on the target reflectance. The Spectrographic Aerial Mapping System with On-board Navigation (SAMSON) that FERI built and deploys is specifically designed to simultaneously handle bright and dark targets.

Figure 3. FERI’s Spectrographic Aerial Mapping System with On-board Navigation (SAMSON; top image) and Ground Processing Unit (GPU; bottom image).

During September of 2006 FERI conducted a mission for NOAA to demonstrate the capabilities of HSI for detecting red tides. Figure 4 shows some results from one of the largest Harmful Algal Blooms (HABs) ever recorded in the US. This three-band false color composite was created with 3 narrow bands in the blue, green, and near infrared from the full 188-band hyperspectral imaging cube.
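Picking three bands out of a cube is mostly a matter of finding the band centers closest to the wavelengths you want. The Python sketch below shows the idea only. The band centers (188 bands, 3 nm apart, starting at 400 nm) and the three target wavelengths are hypothetical values chosen for the example, not the actual sensor configuration or the bands used for Figure 4.

```python
def nearest_band(wavelengths_nm, target_nm):
    """Index of the band whose center wavelength is closest to target_nm."""
    return min(range(len(wavelengths_nm)), key=lambda i: abs(wavelengths_nm[i] - target_nm))


# Hypothetical band centers: 188 bands spaced 3 nm apart starting at 400 nm.
wavelengths = [400 + 3 * i for i in range(188)]

# Hypothetical targets for blue, green, and near infrared.
composite_bands = [nearest_band(wavelengths, t) for t in (450, 550, 710)]
```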


Figure 4. False color composite of red tide in Monterey Bay created from HSI image.

An example of how imaging spectroscopy is useful in quantitatively determining the extent of the HAB in this region may be seen in Figure 5, where the full spectrum (uncorrected for atmospheric interference and illumination effects) is shown in comparison to a spectrum collected outside of the red tide region. The biggest difference is seen in the near infrared region, which is responding to increased reflectance of light by the dinoflagellates in the bloom.

Figure 5. A quantitative look at the spectra from an HSI image inside and outside of the bloom. The green line is the spectrum inside of the bloom; the pink line is from outside of the bloom. The big difference around 710 nm results from the large numbers of dinoflagellates that reflect light out of the water. A different effect accounts for the difference seen in the 400 to 600 nm range, where the dinoflagellates have pigments that absorb light. These pigments result in less light being reflected out of the water where high concentrations of these dinoflagellates are found.

The more subtle differences in the blue and green regions relate to the differences in absorption of light by the pigments in the dinoflagellates. The change in relative reflectance is what gives this bloom its characteristic “red” color (Figure 6).

Figure 6. Red tide (HAB) as seen from the research vessels collecting data during the experiment. (Photo courtesy of Dr. R. Kudela, UCSC.)

An advantage of HSI is automatically rendering data into feature extracted maps. Automated, in this case, means that an algorithm (as opposed to an expert) can render the imaging data stream into maps of bathymetry, red tides, sea grass beds, wetlands vegetation, habitat maps, land use change, etc. Automated is important because these imaging data can be terabytes in size. The time requirements just to load the imagery into computer memory for viewing and editing can be onerous. Trying to manipulate and analyze the imagery for features, targets, and materials taxes the time and computer systems requirements to the point of making HSI technology and products the realm of the few.

The ideal approach is to use well calibrated sensors to remove atmospheric and illumination effects (the subjects of future blog entries) to generate HSI imagery that can be directly processed into target and feature maps during the initial image processing. This approach can render products like Figure 7 in less than 8 hours of processing on FERI’s field processing station (bottom image in Figure 3). These map products are much smaller in size than the original imagery data and contain valuable information for users that are unfamiliar with spectroscopy itself. Using automated feature extraction techniques with HSI provides a mechanism for mapping our world more quantitatively and more frequently than is currently being accomplished with traditional field and photogrammetry techniques. It is the future of remote sensing.

Figure 7. The concept of automated feature extraction and classification applied to the wetlands of Morro Bay, CA using HSI data.

The concept for a server that could handle TBs of HSI imagery was originally conceived as a mechanism for FERI to serve its research partners. WeoGeo Market and Server took this concept and expanded it to handle a larger number of map forms, in a more intuitive manner. The Market provides a portal where others can contribute their value-added mapping content and be compensated. Server gives an enterprise the ability to manage its geospatial content, as well as easily monetize that content. Together they help address what became one of our hardest technical challenges at FERI: how do we serve our partners the maps that they want?

Amazon, WeoGeo, FERI, mapping

AWS and Web 2.0 Mapping

I have been a bit delinquent in posting to this blog as of late. I am shaking the dust off of my blog because of the post that Jinesh Varia made about WeoGeo. Mapping, particularly quantitative mapping like GIS, and AWS go together like peanut butter and jelly (I have 3 small kids who have been out of school all summer, so this was the first analogy that came to mind). The utility computing of EC2 and the large web-addressable disk storage of S3 provide opportunities for developing and sharing of mapping products that previously were cost prohibitive. Being Jinesh’s favorite in this category is way cool (and I plan to send him a PB&J for lunch).

We have been very busy, with some real exciting things happening. I hope to share many of them shortly. One of the things we have been working on is the delivery of our first WeoGeo Server to the College of Ocean and Atmospheric Sciences at Oregon State University. You can see their front page here, but you have to register to get access. Access is currently limited to those involved with a red tide experiment in Monterey Bay, CA during September 2006. (We were involved in the NOAA experiment through FERI, operating our HyperSpectral Imaging (HSI) system.) In addition, we have been working on bringing the Seller site of WeoGeo Market out of Private Beta.

I know I have been remiss on posting, but between the kids’ summer vacation, the delivery of Server, beta responsibilities for WeoGeo and WeoCEO, and the scientific responsibilities of FERI, I have let the job of blog posting slide. I promise more posts on imaging sciences, GIS, and utility computing real soon.

Background, Amazon, WeoGeo, FERI

Building a Web 2.0 Mapping Solution

I am writing from San Francisco today, where I am attending both the Web 2.0 Expo and Location Intelligence conferences. I have found the serendipity of discovering real potential value in the concept of “Web 2.0” while developing our solution for B2B mapping a bit humorous. My original take on the Web 2.0 business was that it was all about social networking and advertising. However, our industry (the global mapping industry) is ripe for a true SOA solution, and we are trying to build something that will release the potential of both the internet and mapping beyond just the ability to share mashups. In order to accomplish our goals we needed to overcome some critical infrastructure hurdles in the development of a platform that allowed real internet commerce to proceed within the mapping industry. As I am preparing for both of these conferences this morning, I thought I would begin to share some thoughts on how we are planning to build an SOA, which may be considered a Web 2.0 application.

The global mapping industry is a $4 to 7 billion a year market (depending on which report you read). It is a B2B industry, dominated by large investments in infrastructure (think satellites, airplanes, computers, software, and content), as well as large investments in highly skilled technicians. The data volumes are enormous; our own mapping efforts (at FERI) run upwards of 10 terabytes of mapping products per day, requiring multiple distributed processors just to generate the maps, which we then have to serve to clients and users in near-real time. WeoGeo (www.weogeo.com) is our B2B portal and server solution to rapidly deliver mapping products to end user customers.

Imagine building the computational and internet infrastructure to deliver gigabyte to terabyte size maps. A terabyte map takes more than 60 days to be transported over a 1.5 megabit per second link. WeoGeo has developed the technology to dramatically reduce this effort, but to service a global market of such maps would require mind boggling infrastructure support. Enter Amazon Web Services (AWS).

The initial beauty of AWS is in the cost structure, where we are paying for our computing time (EC2) and data storage (S3) on a pay-as-you-go basis. Our initial budgeted start-up infrastructure costs were ~$300,000 plus first year expansion ~$200,000. When we budgeted the same effort on AWS, its pro forma was somewhere between $10,000 and $20,000. AWS allowed us to spend our limited start-up dollars on developing the technology of WeoGeo, rather than buying and maintaining computers. But the initial beauty is quickly overtaken by something a bit more sublime. With EC2 and S3 our processing and storage requirements are totally scalable. The term scalable is so prevalent in today’s business press that it often loses its significance. However, scalable to us has very significant time lag and cost implications. Besides pitching to potential customers who have map archive inventories approaching petabytes, we are talking about a web services business that currently counts 200 million Google Earth users. If we are as successful as we hope to be, an exponential growth in business would rapidly overcome our abilities to assemble hardware, much less install and maintain servers to service the business. Our business requires scalability, with a capital S.

So we made a bet at the beginning of WeoGeo that a business model built on commodity computing cycles or elastic computing, as opposed to commodity computers, would best enable us to handle growth in this industry. The fact that the upfront cost was cheaper was a bonus. The bet required focus, so we decided to make an all inclusive AWS service platform that required no outside data center processing or storage. For this to work, our web and database services had to be robust and durable in a virtual machine environment that in and of itself might not be durable. It had to handle spiking (think “Digg Insurance”) and cyclic patterns in processing to assure up-time and optimize costs. It also had to address load balancing and stable IP addressing in an environment where the virtual machines’ IP addresses and domain name records may be lost.

With a lot of brain sweat and great interaction with the AWS team, we created an internal EC2 management solution that accomplished these goals. After some prodding by the AWS team, we have begun to offer one of these solutions as a product. WeoCEO (www.weoceo.com) is a management solution for stable IP address, fail-safe monitoring, load balancing, and auto-scaling of EC2 resources. Besides the insurance aspect of this solution that provides for robust ecommerce activities, the auto-scaling feature actually provides a tremendous cost savings over daily and seasonal cyclic usage patterns. We look to providing an extension of the WeoCEO services for durable database operations in the near future.
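To give a feel for why auto-scaling saves money over cyclic usage, here is a deliberately simple Python sketch of a scaling rule. It is not how WeoCEO works internally; the queue-based trigger, the instance limits, and the function name are all assumptions made for illustration.

```python
import math


def desired_instances(pending_jobs, jobs_per_instance, min_instances=1, max_instances=20):
    """Instance count needed for the current queue, clamped to a floor and a ceiling."""
    needed = math.ceil(pending_jobs / jobs_per_instance)
    return max(min_instances, min(max_instances, needed))


# Quiet overnight, busy mid-day, and a traffic spike (hypothetical queue lengths).
counts = [desired_instances(jobs, jobs_per_instance=5) for jobs in (0, 23, 1000)]
```

With pay-as-you-go pricing, running one instance overnight instead of the peak count is where the savings come from, while the ceiling keeps a spike from running up an unbounded bill.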

In short, the mapping industry is a competitive B2B market that has high infrastructure costs to support large processing and storage requirements. WeoGeo has created an SOA on AWS that will allow for the unleashing of huge volumes of archived mapping products to create a geospatial information exchange that will scale from the smallest to largest users. While cost containment will certainly be a key component to our viability, we believe that quick, reliable, and scalable service will be more important to our eventual success.

Adena Schutzberg at Directions Magazine (which is hosting the Location Intelligence conference) has indexed a podcast to be available on April 17th titled, “Is Web 2.0 Mapping “Dead”?”. All I can say is that we don’t think so (and I sure hope not).

Background, Remote Sensing, Hyperspectral, Amazon, FERI

Mapping with Amazon’s Mechanical Turk

I was saddened today by the news of Jim Gray. I heard about it from my colleague who pointed me to the efforts of Michael Arrington at TechCrunch and Werner Vogels at Amazon. I feel somewhat connected to the effort because of the hours spent on Michael’s site, and our development of a new internet business using Amazon’s S3/EC2 systems. Mostly I feel connected because finding things in the ocean using imagery is what we do.

My first thought was we can help, particularly after I saw that the NASA ER2 flew with a hyperspectral imager. This is what we do. We recently demonstrated (see here as well) the capability to NOAA NESDIS to collect and process nearly 4000 square kilometers of coastal ocean hyperspectral (5 m resolution, 256 channels in the visible and near infrared) and multispectral (0.8 m resolution, 3 channels) data in less than 18 hours. Our flight imagery is ~1 TB in raw form, and up to 5 TB processed, and we are some of the best people I know at the imagery and processing game. I figured that since we have an EC2/S3 account for WeoGeo, we could upload some of our image processing software and get in there and help.

It was then that my colleague had to rein me in. Jim Gray had been missing since last Sunday, and the ER2 data was very limited. The oceanographer in me took a deep breath, and after reviewing more about the availability of the imagery, I realized there was probably very little that we could do to help. The ocean is a big place, and while the amount of imagery was large, the ocean was a lot larger.

In addition, the visible imagery was limited to just a few bands. Just a few bands means that there are limited degrees of freedom for automated feature extraction techniques (that is a techie term that just means using the computer to sift through the imagery to yield the information for which you are searching). The fewer the bands, the more that sensor, illumination, and environmental noise dominate the imagery, and the less likely you are to find the object of your search.
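To put a rough number on the band-count argument, here is a minimal sketch. It is not our processing chain. It assumes an idealized matched-filter detector, noise that is independent from band to band, and a target with the same small contrast in every band; real sensor, illumination, and environmental noise is correlated across bands, so treat the result as an upper bound on the benefit. The function names and the contrast and noise values are made up for illustration.

```python
import math


def matched_filter_snr(contrast, noise_sigma):
    """Detection SNR of an ideal matched filter.

    Assumes the noise in each band is independent (an idealization).
    contrast[i] is the target-minus-background signal in band i and
    noise_sigma[i] is the noise standard deviation in that band.
    """
    if len(contrast) != len(noise_sigma):
        raise ValueError("contrast and noise_sigma must have the same length")
    return math.sqrt(sum((c / s) ** 2 for c, s in zip(contrast, noise_sigma)))


def uniform_snr(n_bands, per_band_contrast=0.5, per_band_sigma=1.0):
    """SNR when every band has the same (hypothetical) contrast and noise."""
    return matched_filter_snr(
        [per_band_contrast] * n_bands,
        [per_band_sigma] * n_bands,
    )


if __name__ == "__main__":
    # Compare a 3-band camera with a 256-channel hyperspectral imager.
    for n_bands in (3, 256):
        print(n_bands, "bands -> SNR", round(uniform_snr(n_bands), 2))
```

Under these assumptions the SNR grows with the square root of the number of bands, so the 256-channel case comes out roughly nine times better than the 3-band case for the same per-band contrast. That gap is the practical meaning of "limited degrees of freedom."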

Werner Vogels sought to use one of the best tools he had available, the Mechanical Turk. It was one of the quickest ways to put eyeballs on the imagery. By using S3, they had the means to store and distribute large volumes of imagery. Unfortunately, people’s eyes are just not that sensitive to noisy imagery with little spectral information. It is very hard to “see” something in ocean imagery, particularly if it has been compressed at some stage of the processing, which frequently removes the very targets you are interested in finding. That’s why we use high resolution spectral and spatial data and develop the processing algorithms to have the computer render these volumes of data into maps that tell us something important. In military parlance, it is called actionable geospatial intelligence. In this case, it is about saving lives.

Spectral imaging is not the only means to find things on the water. There are other systems that can be used for ship tracking. Microsoft’s Vexcel has the capability to use synthetic aperture radar data for this purpose, and I am sure they will put it to use. It is a credit to Werner and the community that they have been able to respond as rapidly as they have. However, I am still feeling a sense of failure. Our community (scientific, engineering, imaging, GIS, etc.) knows how to accomplish these types of mapping goals to save lives and property. The problem is that there has not been enough demand for the results to justify the expenditures at the current price of the systems and products.

The systems that we fly are $1 million+. The processing costs are $10,000s (sometimes up to $100,000s) per day of operation. The issue is one of scalability and demand pull. For an integrated Search And Rescue (SAR) system to have provided help to Jim Gray, it would have needed to be a fraction of those costs, rapidly deployed on manned and unmanned vehicles flying at high altitudes (including space), delivering actionable maps within hours (if not minutes) of landing or downlink. Such technology is obtainable, but the capital investment is large.

We are trying as hard as we can, to the best of our abilities, to change the mapping game by creating and sharing knowledge, not just pictures. This will take time.

My heart and prayers go out to the friends and family of Jim Gray. I just wish we could help today.

 

Update: 1730 EST, February 5, 2007
I spoke with a contact at NASA JPL. It appears that the NASA ER2 flew without the hyperspectral sensor, but with another imaging package. WPB
