Archive for the 'mapping' Category

Remote Sensing, WeoGeo, mapping

Aerials Express Signs Up for WeoGeo Market

How do you make a geospatial exchange a reality? You find great content providers to bring their wares to the market. Aerials Express (AEX) is one of those great content providers. With 420,000 square miles of high resolution aerial imagery over major metropolitan areas in the US (see map below), AEX brings base map content to “prime-the-pump” in the derivative product marketplace.

Christopher Warren and Bill Landis at AEX have been great. Their listings of AEX products address a big niche in our industry: high resolution imagery that can be physically acquired and manipulated, with an explicit license to resell derivative works. Bill’s quote from the press release:

“WeoGeo is an excellent opportunity for our company,” said Bill Landis, President of Aerials Express. “We are looking to WeoGeo’s advanced technology and unique distribution model to enhance the availability of our products into a wider range of GIS related markets.”

It says a lot about the potential of an exchange-based market for our industry.

We will do our absolute best to make the market technology easy to use for search, discovery, and product acquisition. Its success will increase productivity and margins for all of its participants. Today, we mark its beginning.

Amazon, geospatial, mapping

Innovation in Web Mapping Systems

There is a nice discussion happening on James Fee’s Blog about Web Mapping Systems and Services and the future of hosted mapping services. I was reading it and thought back to an interesting Wall Street Journal article on Monday about Circuit City that said same store sales in December fell by 12% in the US. While this news was depressing for the stock market, the silver lining for the geo-community was that navigational products were the only product line with increasing sales over the period.

Geo-devices are becoming more ubiquitous. The sheer number of curious and talented people moving into our industry, combined with these devices, will drive product and service innovation in directions that may not be completely clear at the moment.

Converging with the mass market penetration of geo-devices and geo-content (geoware?) are the cloud computing efforts by AWS (and soon to be others). While the production of quality mapping today may require high end desktop workstations and servers, I think that Moore’s Law is eventually going to allow our field to produce geo-content and services far more easily, leading to a feedback into future product innovation. How we in the professional community create products and services today may be radically different in the future.

I offer this anecdote: today, after 10 years of running a Microsoft Exchange Server for our email requirements, we switched to Google Mail Premium. Over the 10 year period, we incurred costs in the $10,000s, possibly greater than $100,000. These costs included licensing, hardware, server room, service personnel, etc. Our spam filter alone on the MSFT Exchange Server cost us $35 per year per mailbox. Our cost for the Google Mail Premium service is $50 per mailbox per year. It is easier to use, cheaper to implement, and offers more robust service than the Exchange product.

I think there might be parallels for our industry in this anecdote. It is probably a good exercise to be thinking about what products might be replacing the ones we are using today.

The future of GIS, geo-content, geo-entertainment, etc. will belong to those who can think outside of the traditional methods of production and product delivery. For historical evidence of the difference between companies that focus on the future and those that focus on their current narrow niche, look at the change in market capitalization of Trimble (TRMB) and Garmin (GRMN) over the last decade.

Above Chart taken from Google Finance

Amazon, WeoGeo, geospatial, mapping

WeoGeo’s Mapping Marketplace Makes Final Cut in Amazon’s Start-Up Challenge

The only thing I can say is, “Wow!” Followed by the biggest grin you have ever seen on my face. By naming us one of 7 finalists, Amazon has expressed its confidence in our technology and business strategy. In all honesty, I am humbled and honored, and I truly thank them for the selection.

I believe (passionately) in what we are trying to create. I believe that WeoGeo will change the paradigm in how we discover and access geo-content. I believe that we (the geospatial industry) as a community will more easily synthesize new mapping products that will help us create a better world. But these are my beliefs, and I tend to view everything we do through these rose colored glasses.

The selection as a finalist by Amazon Web Services (AWS) means that someone else out there sees the same potential for the mapping and geo-content industry as we do. It provides validation for the people who have worked so hard on this project beyond anything that I could offer, and for this I am eternally grateful.

In addition, Amazon will offer the winner of this contest a venture investment. I believe this says a lot about the geospatial industry, as well as WeoGeo. For WeoGeo to be among those considered a suitable investment opportunity by a $32 billion company, we must have (1) a great business plan, (2) a great set of technology, and (3) be in an industry with high growth potential. Our industry, the geospatial industry, is now recognized by a leader in the internet services industry as having high growth potential.

I’ve been grinning so much, my face hurts…

Background, WeoGeo, geospatial, mapping

Profiting from Collective Intelligence

I have had a number of questions from our private beta Providers that basically ask, “What maps should I be making?” To be honest, I wish I knew. In reality, WeoGeo Market was established to answer this very question.

We set up WeoGeo to lower the risks of creating and selling mapping products (most importantly by reducing marketing and transaction costs). We believe that by lowering the risk of creating and selling geo-content, more products could be created at lower prices. By combining more products at lower prices with a greater ability to find and customize those products, Users of those products would be more apt to purchase more geo-content. The overall goal is to create a truly functioning marketplace for geo-content. The end result would be a collective intelligence expressed through the market that would help all of us focus on making the most valuable geospatial products.

Answering the question of what product to create is one of the hardest parts of running FERI’s operations. While FERI is a research and development organization, we still had to perform the basic sales and marketing efforts of finding paying customers to support our development of value-added mapping products. It is a very time consuming and difficult process, requiring a lot of telephone calls and a lot of travel to search out programs that would value our imaging and mapping efforts. Such is the nature of sales and marketing, and every good salesman would tell you that is just the way you have to generate business.

However, as a scientist I want to focus on the generation of new mapping products. While I could (and still do) focus on sales and marketing, my real interest is in generating new mapping products that could help people make decisions with their resources or help save lives. With the hyperspectral imagery, we could develop maps focused on a variety of topics. These maps could range from Harmful Algal Blooms (HABs or more commonly called red tides) to Submerged Aquatic Vegetation (SAV) to detecting probable locations of Improvised Explosive Devices (IEDs). Yet, finding sufficient demand for these products to overcome the high initial production cost of creating these products is difficult. (I have a whole other story on the IEDs and how the DoD does business with contractors and appropriation earmarks that I’ll save for another time.)

Over the years, we have watched with great interest how the internet has impacted other businesses. One of the most interesting impacts that we have seen is the rise of shared intelligence from the accumulation of individual choices. For example, search engines have used the individual linkages of web page creators to develop a collective intelligence estimate of the most likely desired result for a search term (and a new industry of search engine optimization). In particular, we were fascinated by eBay’s ability to enable millions of people to develop larger markets for their niche products.

By establishing a functioning marketplace for these niche goods, eBay created liquidity and demand for products that previously had limited marketability. In the process of creating a market whose niches could be efficiently filled, they also provided opportunities for entrepreneurs to develop new markets. In effect, eBay created a platform that enabled individuals to make choices, create products, and satisfy the needs of others, which in turn created a positive feedback mechanism for everyone who participated. This led to the creation of whole businesses that did not exist prior to eBay, and to a rise in the valuation of goods that previously had limited market penetration and thus an underdetermined recognized value.

The increased liquidity of products and the collective actions of many individuals led to a self-sustaining marketplace that enriches all of the participants. eBay is a lesson in economic theory, and gives truth to the concept that “a rising tide lifts all boats.”

So what does this have to do with answering the question from our Providers about which maps to produce? The answer to that question is that I am not sure, but I can make sure that the Provider’s risk is low enough for them to make some reasonable choices, and to give them the agility to respond to market demands. Through this process, I believe that our collective intelligence will point Providers in the most profitable direction.

This marketplace will give those with the skills and those with the content the ability to connect as never before possible. The new network of connections will lead to the creation of new geo-content that will enhance and enrich the lives of our community. And our community will profit from it because we will know which maps to make.

Background, WeoGeo, geospatial, FERI, mapping, WeoGeo Server

Follow-up to Direction Magazine’s Podcast on WeoGeo

Adena Schutzberg did a podcast with me last week on the business model for WeoGeo. It was my first podcast and I hope that I made sense to people (I welcome comments and/or critiques in the comments section here). I would like to thank Adena for giving us the opportunity to tell our story.

However, I am not sure I was as clear as I could have been about our history and the importance of that history in the development of WeoGeo. I could not quite put my finger on what was missing until after the AWS StartUp Event - Boston (see here as well for my comments) when someone asked how many man-years of effort went into developing the site.

My first response was to take the number of years that FERI was in operation times the number of people involved at FERI. Kind of silly, I know. But when I think of why we built WeoGeo, this response seems relevant. Their response, of course, was, “no really, how much technical development time?” I understood the question; the person was trying to ascertain how difficult it would be to recreate what we are doing.

Our technical development on this project did start back around 2001 with a project called Hyperspectral Data Repository On-line (HyDRO). This was our first distribution system, developed to help alleviate the problems associated with delivering HSI data to our customers. This concept and technology eventually evolved into the WeoGeo Server (see post here as well). Between 2001 and 2005 we had 4 PhDs and masters-trained personnel spending a portion of their time on HyDRO because it was a critical element of our research programs. In the last couple of years, we increased the number of people working on WeoGeo Market/Server, to >12 currently if you include outside contractors. For the most part, they are highly trained GIS and MIS/CIS/CS personnel.

The technology is hot, no question about it. I am amazed on a daily basis by what our group of people has developed for mapping on both commodity computers and utility computing systems. Yet, here is the rub to this type of man-years calculation. I really believe that the reasons for WeoGeo, and its associated development time, stem from our history at FERI, which makes such calculations difficult. The “technical development time” is not just time spent coding; it includes the needs assessment and the development of the system architecture to address critical problems and/or pain. What we have developed at WeoGeo is a direct function of two critical needs of our operations as a research and imagery services organization.

These two critical needs were (and still are):
1) Delivery of our survey grade, high volume mapping content;
2) Finding and acquiring other survey grade mapping content to fuse with ours to create value-added geocontent for our clients.

WeoGeo was built to solve these two critical problems (there are others, but not nearly as critical to our organization as these). If you have never been faced with these problems, then you might not appreciate the depth of the solutions we have built to service these needs (and its potential). But if you have, then you have felt our pain - and I hope value our solution.

Remote Sensing, Hyperspectral, FERI, mapping

Image Fusion and Sharpening with Multi and Hyperspectral Data

The panchromatic limitations of WorldView-1, recently launched by Digital Globe, have brought a few posts (e.g. free geography tools and the confused life) on the fusion of high spatial resolution panchromatic imagery (PAN) with lower spatial resolution multispectral imagery (MSI). I thought I would briefly comment on image fusion because over the years it has become easier to accomplish, but the results or limitations of the fused product may be difficult to understand.

There are many ways to accomplish pan-sharpening, including band substitution, color space transformation and substitution, and Principal Component Substitution (Jensen, 2005). As mentioned on the confused life, temporal decorrelations introduce artifacts into a fused, or PAN-sharpened, image. However, there are other artifacts that can be equally important if one is trying to create a quantitative product for classification mapping or target detection.
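To make the color space substitution idea concrete, here is a minimal sketch of one common variant, an additive intensity substitution (often called "fast IHS"). It is an illustration only, not the method used in any of the posts or products mentioned here. The function name and the pixel values are hypothetical, and the MSI pixels are assumed to be already resampled onto the PAN grid.

```python
def pan_sharpen_intensity(msi_pixels, pan_pixels):
    """Additive intensity substitution on co-registered pixels.

    msi_pixels: list of (r, g, b) tuples, already resampled to the PAN grid.
    pan_pixels: list of PAN values, one per pixel.
    """
    sharpened = []
    for (r, g, b), pan in zip(msi_pixels, pan_pixels):
        intensity = (r + g + b) / 3.0   # simple intensity estimate
        detail = pan - intensity        # spatial detail carried by the PAN band
        sharpened.append((r + detail, g + detail, b + detail))
    return sharpened


# One coarse MSI pixel covering two PAN pixels (hypothetical values).
msi = [(90.0, 120.0, 60.0), (90.0, 120.0, 60.0)]
pan = [100.0, 80.0]
fused = pan_sharpen_intensity(msi, pan)
print(fused)  # [(100.0, 130.0, 70.0), (80.0, 110.0, 50.0)]
```

Note that the differences between bands within each pixel are unchanged, while the absolute values now follow the PAN band. Whatever illumination and atmospheric conditions were baked into the PAN values are carried into every fused band.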

The inherent difficulty with all of the PAN sharpening methods is that they are fundamentally based on the technical and environmental conditions under which the PAN imagery was collected. Since it is difficult, if not impossible, to accurately correct for illumination and atmospheric conditions in PAN sharpened imagery (subject for a much longer post), the PAN-sharpened images may be limited to classification and detection within a scene. Inter-scene comparisons (i.e. change detection between scenes or cross scene classifications) using spectral properties require the aforementioned corrections. In addition, when the instantaneous field of view (IFOV) of the PAN and/or MSI sensors are too large, spectral and illumination changes will be present at the edges of the image, making even within scene classifications difficult. Because of these issues, PAN-sharpened multispectral images are frequently used to identify features based on relative color differences within an image, rather than target identification or environmental characterization based on a spectral signature itself.

Figure 1. The fusion of high spatial resolution MSI (left figure) with lower spatial resolution HSI (middle figure) into a high spatial resolution, high spectral resolution image (right image). The bottom row of images represents the spectral plots at the pixel located at the center of the red cross hairs in the images directly above them.

We have done some work in this area, mainly focused on sharpening hyperspectral imagery (HSI) with multispectral imagery (MSI). Figure 1 shows the results of some of our efforts. The left image is a high resolution MSI from an Applanix DSS. Underneath it is the digital value of the RGB channel of the image. The middle image is the lower spatial resolution HSI; and underneath it is the full spectrum resolution of the HSI vector (~3 nm resolution). By fusing these two images together (right image), we were able to create a high spatial resolution sharpened HSI image whose spectral vector matched reasonably well with the spectral vector from the original HSI image. The use of atmospheric- and illumination-corrected HSI imagery means that we could make classification comparisons or target detections using these spectra much more robustly across scenes in time and space.

When making fused, or derivative, mapping products, the value of the map is critically determined by the base mapping material and the skill of the map producer. Understanding the limitations of the base mapping material as well as the fusion techniques themselves is a critical determinant in the value of a derivative mapping product.

References
Jensen, John R., Introductory Digital Image Processing: A Remote Sensing Perspective. Prentice-Hall, Englewood Cliffs, NJ, 2005, 526 pp.

WeoGeo, geospatial, mapping

How do you connect “Islands of Information”?

The worldwide spatial information management industry has been estimated at ~$50 billion. While large, the industry is dominated by specialization and niche practices that have reduced the flow of spatial information between location-aware enterprises. This reduction in information flow decreases efficiency and productivity within enterprises, and between industries.

Let us examine the different vertical markets that make up the spatial information industry, including urban planning, emergency response, real estate, natural resource management, environmental protection, agriculture, asset management, construction, advertising, etc. They all use slightly different tactics to acquire their spatial awareness or geospatial intelligence (Figure 1; this figure and the next are from a 2007 Where 2.0 presentation. If you are interested in the full presentation let me know.). However, all of these industries have very similar needs in that they require high quality maps to make fundamental (insert your favorite term here, e.g. business, asset, resource, targeting, etc.) decisions.

Figure 1. Vertical silos in the spatial information business keep the markets small and separated.

If we can break down these vertical silos, such that the maps in one niche were used as raw material into the next niche, we can re-order our geospatial markets to look like Figure 2. Here, the silos become building blocks for higher valued information products, which in turn are used as base products for higher valued geo-enabled processes. These building blocks now increase business process efficiency and productivity for the spatially-aware enterprise. As any process manager will tell you, increasing efficiency and productivity is good, really good, because it means you can do more for less.

Figure 2. Silos are changed into building blocks for higher valued industries, increasing efficiency of productivity and resource management.

A recent article from Geoff Zeiss (who was building upon a 2004 article by Paul Teicholz) used the construction industry as an example of the impact of information silos. He first points out the size of the construction industry, worldwide = $2.3 trillion, US = $1.2 trillion. That’s trillion with a T.

Paul’s article examines a decline in construction productivity, during a period when all other industries were looking at increases in productivity (Figure 3). Paul points to a lack of IT integration and R&D by the building industry as a reason for this real fall in productivity, while all other non-farm industries appear to have used IT to become more productive. Geoff goes farther (and I tend to agree with him) that part of the problem relates to the ‘Islands of Information’ that are created, and not shared, by the various disciplines involved with the construction industry:

Disciplines such as architecture, structural engineering, construction, civil engineering, and GIS are classic information silos. Each maintains its own information island comprised of design applications and data. This has created a nightmare for operations and maintenance, emergency planners and responders, urban planners, and others who require seamless access to urban terrain including building interiors and exteriors, roads and highways, and above ground and underground utilities. The biggest challenge is not typically data, because the data that would help these folks already exists because much of (sic) it is created when buildings and infrastructure were designed. The biggest challenge is that islands of information and technology make it difficult to integrate existing data in a seamless view.


Figure 3. Labor productivity declines 1964-2003. (from ACEbytes Viewpoint #4)

WeoGeo was started specifically to address the creating, sharing, and marketing of geospatial content that will help increase the productivity of spatially-aware industries. We have built an easy to use interface and system to rapidly list, host, discover, customize, and deliver value added geo-intelligence in a way that generates revenue for content providers and is affordable for content users. We are using a classic exchange mechanism to create a neomarket to “remake” the silos into “connections” between the islands of geospatial information. (I know I am mixing metaphors, but I couldn’t help it. Sorry.)

Does it matter? Are there enough inefficiencies to be found that will translate into dollars to make a difference? Here is another quote from Geoff’s piece:

Several years ago the National Institute of Standards and Technology (NIST) commissioned a study on Interoperability to attempt to quantify the efficiency losses in the U.S. capital facilities industry… NIST estimated that in 2002 poor interoperability cost the US capital facilities industry $15.8 billion.

That leaves some room for improvement in efficiency. And this is just one spatially-aware industry. An increase in productivity in these industries will create a more efficient use of (natural) resources, which over time creates a positive feedback into the quality of operations (and life) for all those using planetary resources.

Storage, Background, Remote Sensing, Hyperspectral, Amazon, WeoGeo, geospatial, grid computing, WeoCEO, mapping, WeoGeo Server

Image Processing and Delivery using Virtual Computing on EC2

I posted last week about bandwidth issues associated with geospatial data and our AWS S3 solution. The deciding factor for us to use Amazon’s offerings was not necessarily the edge distribution capabilities of S3, but the synergy from combining S3 data storage and distribution with virtual computing capabilities of EC2. There are multiple issues in image processing that require a ton of memory space and CPU horsepower. In both Market and Server, we offer the following basic map distribution options to our map providers -

Geo Clipping (6 zoom levels, allowing for ~125 million possible selections per data set)
Spatial Resampling (4 levels)
Layer Resampling (depends on data)
Output File types (5 - JPEG, GeoTIFF, ENVI, ESRI BIL, ERDAS IMG)
Projections (5 - UTM, Transverse Mercator, Lambert Conic, Albers Equal, Geographic)
Datums (3 - WGS84, NAD 83, NAD 27)

These options result in millions of possible map variants, which preclude the storage of each variant for distribution. So processing power for conversion is critical; and this processing power needs to be connected to a large, web-addressable, temporary data storage array to house the unique variant that a map user has selected. Now for a true mapping marketplace, this infrastructure needs to support 100s to possibly 1000s of simultaneous map requests from the same base map like the 40 GB image in Figure 1. Doing our NeoMapping Market correctly requires the creation of enormous processing, storage, and bandwidth infrastructure.
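A rough count shows why storing every variant is impractical. The sketch below multiplies the option counts from the list above. It assumes every combination is valid, leaves out layer resampling (which depends on the data), and assumes a full-extent variant is about the size of the 40 GB base map. Treat the results as an upper-bound illustration rather than a measured figure.

```python
import math

# Option counts taken from the list above. Layer resampling is left out
# because it depends on the data set.
option_counts = {
    "spatial_resampling": 4,
    "output_file_types": 5,
    "projections": 5,
    "datums": 3,
}
clip_selections = 125_000_000  # "~125 million possible selections per data set"

# Combinations of resampling, file type, projection, and datum.
format_variants = math.prod(option_counts.values())
total_variants = format_variants * clip_selections

# Assumption: each full-extent variant is roughly as large as the base map.
base_map_gb = 40
full_extent_storage_tb = format_variants * base_map_gb / 1000

print(format_variants)         # 300
print(total_variants)          # 37500000000
print(full_extent_storage_tb)  # 12.0
```

Even before clipping is considered, pre-rendering the full-extent combinations of a single 40 GB map would take on the order of 12 TB under these assumptions, which is why on-demand conversion with temporary storage is the more practical design.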

Figure 1. 40 GB, 156 layer HyperSpectral Imagery (HSI) map listed on WeoGeo Market. (Click on image to go to the listing in the Market).

However, who could afford that infrastructure upfront? Our original estimates for acquiring base computation needs and placing them into a co-location facility were around $500K. While not a lot of money in the scale of today’s internet operations, it was big for us. In addition, we were trying to develop the software architecture to support the Market and Server, and these expenses were large in and of themselves. AWS provided a unique and simultaneous answer to many of our immediate storage, processing, and distribution needs.

Developing our infrastructure on the scalable AWS solution allows us to say we can support the 1000s of map requests required for a functioning digital marketplace. The user experience is vital to the service’s credibility and therefore our success. However, there is a true (and in a number of cases unexpectedly high) cost in this decision. We traded high capital expenditures for high operating expenditures. In an upcoming post, I’ll talk about the Total Cost of Operations (TCO) on AWS, and some of the ways we are moving to reduce these high operating expenses through stability and scaling solutions. Some of these solutions we have turned into products that we provide to others (e.g. WeoCEO).

I would be interested in hearing about the actual experience of others on AWS and whether S3 and EC2 could or could not meet their needs.

Storage, Background, Amazon, FERI, mapping, WeoGeo Server

How do you deliver 100 40GB imagery files?

This is a bit tougher than the solution discussed in this earlier post. When we (FERI) first started developing HSI sensors and flying them for others, the distribution of imagery data was mainly through DVDs. As the research groups got larger, we started getting more and more requests for data. This eventually led to the WeoGeo Server solution, which allows for customization and asynchronous delivery.

However, 100 40GB files that look like Figure 2 in my HSI post means 4 TB of data through our lab’s pipe in a relatively short period of time. Our bandwidth at the time we were trying to develop these solutions was a dedicated T1, or 1.5 Mbit per second. To transfer 4 TB of imagery files with full access to our pipe would require 259 days.
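The 259 day figure can be reproduced with a short calculation. The sketch below assumes binary units (1 TB taken as 2^40 bytes and 1 Mbit as 2^20 bits), which is the reading that matches the number above. The decimal reading is shown for comparison. Protocol overhead and link sharing are ignored, so real transfers would take longer.

```python
def transfer_days(total_bytes, bits_per_second):
    """Days needed to move total_bytes over a link running flat out."""
    seconds = total_bytes * 8 / bits_per_second
    return seconds / 86_400  # seconds per day


# Binary units: 4 TB as 4 * 2^40 bytes, 1.5 Mbit/s as 1.5 * 2^20 bits/s.
binary_days = transfer_days(4 * 2**40, 1.5 * 2**20)

# Decimal units: 4e12 bytes over 1.5e6 bits/s.
decimal_days = transfer_days(4e12, 1.5e6)

print(round(binary_days, 1))   # 258.9
print(round(decimal_days, 1))  # 246.9
```

Either way, the transfer takes most of a year on a T1.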

Clearly there are some solutions these days that would have helped this type of large file distribution effort. Akamai, Limelight Networks, or some BitTorrent solution would provide capabilities to deliver large files over distributed networks. However, we were also providing search and customization solutions, which required modification of the data before delivery. This meant that we had a scalability problem in processing as well as delivery. Edge distribution solutions would solve one part of our problem, but not necessarily the processing part.

We began to explore co-location solutions, but these seemed to require a lot of upfront costs, as well as travel and maintenance expenses. As a small business, those capital expenditures were more than we could absorb. It was at this point that we were introduced to Amazon Web Services by a former co-worker who had been recruited by Amazon. AWS allowed us to build a distribution of large data files on top of a very large pipe via S3. (I’ll discuss the processing using EC2 later). It provided us scalable distribution at reasonable cost for those 100 40GB files.

To be honest, there are some devils in the details in using S3 for our operations. But (to date), the service has been more valuable than costly. The rapid ingestion of large files into S3 is a current problem that we are trying to solve. Moving forward we hope to build on the expansion of S3 as Amazon develops more physical data storage locations. This will provide us with some of the edge distribution advantages of the above solutions, while keeping us connected with our virtual computing solutions on EC2.

I’m also curious to see how others are using S3 in geospatial solutions; if you have a unique one, please let me know.

Background, Remote Sensing, Hyperspectral, WeoGeo, FERI, mapping, BigTIFF

What file format do you use for a 40GB image? (BigTIFF!)

Large imagery files are a problem. In the hyperspectral world, we send things via ENVI’s file format (BSQ, BIL, or BIP). ENVI was designed by folks doing HSI remote sensing and was optimized to easily handle large raster images. The use of this file format allows us to deliver extremely large raster files, with a separate header that described all the channels, bands, or layers in the image.
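For readers unfamiliar with the three interleaves, the sketch below shows where a single sample sits in a flat binary file under BSQ (band sequential), BIL (band interleaved by line), and BIP (band interleaved by pixel). It is a simplified illustration: the function name is hypothetical, any header offset in the data file is ignored, and 2 bytes per sample is an assumption rather than a property of our data.

```python
def sample_offset(interleave, band, line, sample,
                  bands, lines, samples, bytes_per_sample=2):
    """Byte offset of one sample in a flat raster file (zero-based indices)."""
    if interleave == "bsq":    # all of band 0, then all of band 1, ...
        index = (band * lines + line) * samples + sample
    elif interleave == "bil":  # for each line: band 0 row, band 1 row, ...
        index = (line * bands + band) * samples + sample
    elif interleave == "bip":  # for each pixel: all band values together
        index = (line * samples + sample) * bands + band
    else:
        raise ValueError(f"unknown interleave: {interleave}")
    return index * bytes_per_sample


# Offset of band 1, line 0, sample 0 in a small 156-band raster.
dims = dict(bands=156, lines=10, samples=20)
offsets = {name: sample_offset(name, 1, 0, 0, **dims)
           for name in ("bsq", "bil", "bip")}
print(offsets)  # {'bsq': 400, 'bil': 40, 'bip': 2}
```

The choice matters for access patterns: BIP keeps a pixel's full spectrum contiguous, while BSQ keeps each band's image contiguous.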

Unfortunately, not everyone owns a copy of ENVI. It is an expensive image processing package. While other remote sensing and GIS packages claimed to handle multi-band imagery data, we found that support for imagery with bands n > 3 was difficult at best. So if our customers at FERI didn’t have ENVI, the transport of the imagery had to be accomplished in another file format. The most common format other than the ENVI format for us was GeoTIFF.

Unfortunately, the GeoTIFF format is limited to 4 GB. This is clearly problematic for the image shown here in Figure 1.

Figure 1. HSI imagery of St. Joseph Bay, FL (click on the image to see the data set at WeoGeo Market.)

This image is a 156-band hyperspectral mosaic. The entire image at its native spatial resolution is 40 GB in size. Cutting this data into 10 tiles of 4 GB apiece would be one way to deliver this data set. But this is problematic for both us and the receiver of the images, as the time, energy, and effort to tile and then re-mosaic is less than efficient.
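The 4 GB ceiling comes from the 32-bit byte offsets used inside a classic TIFF file, and BigTIFF raises it by moving to 64-bit offsets. The short sketch below works out the tile count from that limit. It treats 40 GB as 40 * 2^30 bytes and ignores per-file overhead, so in practice slightly more than the minimum number of tiles may be needed.

```python
# Classic TIFF addresses data with 32-bit offsets; BigTIFF uses 64-bit offsets.
CLASSIC_TIFF_LIMIT = 2**32   # bytes, about 4 GB
BIGTIFF_LIMIT = 2**64        # bytes

image_bytes = 40 * 2**30     # the 40 GB mosaic, in binary units (assumption)

# Ceiling division: minimum number of classic TIFF tiles.
min_tiles = -(-image_bytes // CLASSIC_TIFF_LIMIT)

print(min_tiles)                     # 10
print(image_bytes <= BIGTIFF_LIMIT)  # True
```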

You could also say that, for the most part, HSI data is a relatively small backwater of the remote sensing community, so why worry about it. To this I would respond with the imagery in Figure 2, which we collected at the same time.

Figure 2. 3-Band DSS imagery of St. Joseph Bay, FL (click on the image to see the data set at WeoGeo Market.)

This is a 3-band RGB image from an Applanix DSS. Its pixel size was about 1/6 that of the HSI sensor. The higher spatial resolution makes this image nearly as large as the HSI image. We actually incurred the pain of tiling the full image set for our original customer because they had only ESRI software with which to analyze this image.
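A quick back-of-the-envelope comparison shows why a 3-band image can approach the size of a 156-band one. The sketch assumes both images cover the same footprint and use the same number of bytes per sample, neither of which is stated here, so the ratio is only indicative.

```python
hsi_bands = 156
rgb_bands = 3
linear_factor = 6  # RGB pixels are about 1/6 the size of HSI pixels

# Six times finer in both x and y means 36 times as many pixels.
pixel_factor = linear_factor ** 2

# Sample counts relative to one HSI pixel footprint.
rgb_samples = rgb_bands * pixel_factor  # 108
hsi_samples = hsi_bands                 # 156

print(rgb_samples, hsi_samples)              # 108 156
print(round(rgb_samples / hsi_samples, 2))   # 0.69
```

Under these assumptions the RGB image is roughly two thirds the size of the HSI mosaic, well past the 4 GB GeoTIFF limit.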

Our friends at GDAL asked us about sponsoring a new file format, BigTIFF, which would be based on extending the TIFF format. We were happy to step up to help make this happen. I believe that the other sponsors had similar file storage and distribution issues, and we look forward to broad acceptance of this file format.

It will certainly make our distribution issues easier.
