How can I look up a UPC Barcode?

Because generally speaking, nobody likes shipping their work out to the internet for others to freely scoop up, saving them the man-hours of building it themselves in a competitive business environment.

If Walmart offered up 1,000,000 SKUs that it had properly UPC-coded, another company would get to save thousands of man-hours on Walmart’s dime. So having your own supply chain data and not sharing it basically becomes part of the competitive process.

Walmart and other retailers don’t retag inventory generally. They make the supplier put a scannable UPC bar code on the product.

For Coke, it’s printed right on the product, for apparel hangtags are attached by the supplier, for toys it’s printed on the packaging, etc.

0 = standard UPC. The next 5 digits are the manufacturer number, the next five an item identifier. The last one is a check digit.
2 = items (typically weighed) that embed the price in the bar code.
3 = drugs.
4 = retailer-defined. The retailer can do anything they like after the 4. If you have a loyalty card with a barcode on, it likely begins with a 4.
5 = manufacturer coupon.
6, 7 = standard UPC. They have run out of 5-digit manufacturer numbers, so now use a leading 6 or 7 to create more.
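As a concrete illustration of the structure above, here is a minimal sketch of the UPC-A check-digit rule (the twelfth digit): digits in odd positions are tripled, the rest are added as-is, and the check digit brings the total up to a multiple of 10. The function name is my own.

```python
def upca_check_digit(first11: str) -> int:
    """Compute the UPC-A check digit from the first 11 digits."""
    if len(first11) != 11 or not first11.isdigit():
        raise ValueError("expected exactly 11 digits")
    # odd positions (1st, 3rd, ...) are weighted 3, even positions 1
    total = sum(int(d) * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(first11))
    return (10 - total % 10) % 10
```

Scanners run the same arithmetic on all twelve digits and reject the scan if the check doesn’t come out even.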

There are several posts in this thread that suggest people may think that the terms UPC and SKU are interchangeable. They are not. For example, the statement above talks about SKUs then flips to UPCs, giving as an example a number (12345) that could be a valid SKU but is not valid as a UPC.

UPCs are put on the product by the manufacturer (although big retailers have some power to exert control over this with their vendors). They must follow UPC format rules. SKUs are defined by the retailer and can be pretty much any format, with the probable exception that if they were the same format as UPCs, it would mess up the POS software.

Retailers cannot put their own “UPCs” on items they get from manufacturers. They can assign their own SKUs and put labels over the UPC if they wish. No supermarket would do this - they operate off UPCs. If a retailer covered up a UPC, then coupons that a manufacturer issues for that item would not be able to be matched against the UPC.

UPCs can be linked to SKUs. An item can be labelled with a UPC, but that links to a SKU for stock purposes. Many different UPCs can link to the same SKU. For example, a retailer may have an item that it gets from several different manufacturers (obviously, this does not apply to something like “coke”, where there is only one manufacturer). The retailer buys from the manufacturer giving the best deal at any one time. Each manufacturer will assign a different UPC, but the retailer links them all to the same SKU for inventory purposes.
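The many-UPCs-to-one-SKU linkage described above is, at its simplest, just a lookup table keyed by UPC. All codes and SKU names below are made up for illustration:

```python
# Hypothetical data: two vendors' UPCs for the same retailer item map to one SKU
upc_to_sku = {
    "011111111117": "SKU-1001",  # vendor A's UPC
    "022222222226": "SKU-1001",  # vendor B's UPC for the equivalent item
    "033333333335": "SKU-2002",  # an unrelated item
}

def sku_for(upc):
    """Resolve a scanned UPC to the retailer's internal SKU, if known."""
    return upc_to_sku.get(upc)
```

At the register the scan resolves by UPC; inventory and reordering then run on the SKU it maps to.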

Thanks, amerone, for giving us the official first-digit list. I knew someone on SDMB would know!

I don’t see the problem. Sharing the database means anyone who has a need to relate a product with a number can obtain one or the other. This should be a convenience to all.

What use would product coding be to another manufacturer? They can’t barcode their products with someone else’s company code anyway. But to a retailer or wholesaler, it’s useful information.

You don’t have to open up the company’s internal files to everyone. Just dump a sub-set of it to a public location.

The amount of data processing I am suggesting is so little that the cost is laughably small. If a number is automatically generated and dumped to a master/public database periodically, and stored in a text file on a web site, the cost is so low as to be effectively zero. Of all the programs I have written, proposed or maintained in my professional lifetime, this has got to be the easiest of all, and the storage requirements? Infinitesimal by today’s standards. I simply don’t see where the big cost factor is.

And it seems that handling the requests by hand for a batch of numbers for a new or temporary vendor would be much more costly. Why not just announce that all the data they want is available 24/7 on a web site?

As an analogy, I once had to handle requests for logo files for a company I worked for from various art departments. Rather than handling each individually, I put all the forms of logos on a web site and gave out the address. Much less work for me.

Perhaps I’m missing something, but I just don’t see the downside or the security issue here.

As mentioned before, the UPCs will be put on the goods by the manufacturers, and they will be the same for all retailers. If Wal-Mart chooses to map the UPCs to SKUs, these SKUs will be Wal-Mart-specific and hence of little use to anyone else.

With respect to not sharing their supply chain information, Wal-Mart is part of 1SYNC, which is dedicated to defining and sharing standards and data across the supply chain.

Excuse me for making no sense. A UPC can only hold a small number of digits. This means it can only differentiate between a rather small number of products. All cans of diet Pepsi in a six-pack have the same designator.

The new computer RFID (?) chips are now coming on line. Wal*Mart and the military are starting to require them from suppliers. They are dirt-cheap and can hold a whole mess more information. When you scan them, they do not tell you “I am a can of diet Pepsi from a six-pack”; they have enough oomph to tell you “I am a can of Pepsi produced at plant X on day Y on line Z at time A.” Each and every can of Pepsi (and everything else under heaven and earth) can have its own unique identification. Further, since the information is carried on the chip, you can read it without using a database as a sort of Captain Crunch decoder ring.

Better?

While this is true, Wal-Mart requires them on the units shipped through the supply chain, meaning there is one chip per case or pallet. It is still not cost-justifiable to put an RFID chip on every can of Pepsi. RFID tags currently run 7 to 15 cents each.

That information can indeed be on the chips used in the supply chain. When chips replace bar codes to identify individual items, there will still be a lookup from a database. There is information that changes too frequently to be encoded in the RFID chip at time of manufacture - price is the most obvious. There is also information that varies by geography and which therefore cannot be encoded at time of manufacture when the selling location is not known. Examples of this are taxability and food stamp eligibility, which vary by state.

Plus at the moment the cheap RFID chips carry only 96 bits of information. The database carries one heck of a lot more information than that. 96 bits will not even hold the descriptor printed on the receipt.
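A quick back-of-the-envelope check of that 96-bit figure (the receipt descriptor below is made up for illustration):

```python
EPC_BITS = 96                          # capacity of a basic 96-bit EPC tag
bytes_available = EPC_BITS // 8        # 12 bytes of payload
descriptor = "DIET PEPSI 12OZ CAN"     # a plausible receipt descriptor, 19 chars
# even a short one-line descriptor overflows the tag's entire payload
assert len(descriptor) > bytes_available
```

So the tag can carry a serial number, but the human-readable data still has to come from a database keyed on that number.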

I’m not sure there are any real problems with making this stuff open. The only thing I can imagine is counterfeiters possibly using it to their advantage.

It’s not a “big cost factor”, but there is a cost, it’s not zero.

Storing the data in a text file has potential problems. We have 600,000 UPCs and we are what you would call a “medium”-size company. Do we send the entire thing every night or just new ones and changes? Just a text file of UPC, SKU, text description, color code, color description, and size code would probably be about 60 MB.
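That 60 MB figure is easy to sanity-check. Assuming rough per-field widths (all numbers below are my own guesses, not the poster’s actual schema):

```python
# Assumed fixed widths in bytes for each field, plus one delimiter per field
widths = {"upc": 12, "sku": 10, "description": 50,
          "color_code": 4, "color_description": 15, "size_code": 6}
per_record = sum(widths.values()) + len(widths)   # 103 bytes per line
total_bytes = 600_000 * per_record
print(f"{total_bytes / 1e6:.1f} MB")              # roughly 62 MB
```
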
I listed various interface issues previously; here are more, assuming a simple FTP interface:

Will you ftp the data to the destination server from a production server? If so, it is pretty much guaranteed to be inside the firewall, so now you need to get approval for that box to open a connection to a specific destination server for FTP. Once you get approval, then you need to have the security guys do the configuration.

If you are going to FTP from a box outside the firewall, then you need to get the data there first and then have a 2nd process that performs the FTP.

These are things I have done multiple times for this exact activity, typically SKU/UPC data to Dot Com companies because they typically rely on ftp instead of EDI.

The cost is not zero.

So, given that the cost is not zero (you’ll have to trust me on this one) I ask you, what is the business gaining that would compel them to do this?

When I have a list of 500 open requests/projects of varying magnitude that all return the company something in terms of dollars saved (reduced labor, reduced shipping costs, reduced square footage required in the DC’s, faster turn around time on new designs, etc.) what is the gain to the company that would move that request up the list of priorities?

The “temporary” customer that doesn’t use QRS but wants UPCs doesn’t happen often enough, and many times the work can be off-loaded from IT to customer service or sales by manually exporting inventory-related reports and sending them to the customer. It then becomes the customer’s problem.

If there was a free website with standard interfaces that the retail industry agreed to use then suppliers would drop QRS in a heartbeat and do exactly what you are saying because QRS costs money for every access by customers.

But today there isn’t a standard internet database with QRS-like functionality, up-time, etc.

Even with RFID, the long term goal (of GS1 and industry) is to have a lookup service for additional SKU information similar to what musicat wants but geared towards the needs of retailers and suppliers. For example: This would allow software to read the RFID at receiving and then retrieve additional info automatically that might be required for processing that item (e.g. suggested retail price)

But, again, these types of services require membership in the organization and are not geared towards free consumer access.

RaftPeople, all the problems you pose are valid concerns. But they are easy to solve and have been solved by many people long ago. They are Introduction to Computers 101 level. I was handling inventories and databases of comparable size 20 years ago for multiple locations. It’s not a mystery how it’s done and there are many, many different solutions. It’s not even high-tech compared to some problems; anything to do with text only is a tiny amount of data compared with images, audio and video. Every day, warehouses, medical establishments, banks, financial institutions, POS registers, JIT manufacturers, etc., etc., exchange data of exactly this nature and in much greater quantities, some with high security.

If the UPC establishment sees fit to keep some data semi-proprietary, perhaps they have their reasons, but the ones you suggest do not seem to be sufficient, IMHO.

Paul in Saudi, New Computer Chip Thingee = RFID. Got it. It’s easy when you know the code. :slight_smile:

But right now, RFIDs are costly compared to printed barcodes, and relegated to uses where the cost is not a factor, like pallets. RFIDs are also easier to read from a distance or from inside packaging. But no matter how much data they can hold individually, there will still be a use for a database lookup. An external database has these advantages: it can change after the RFID has been configured or installed, and it can contain almost unlimited amounts of information.

So it looks like the answer to the OP is: some UPCs can be looked up, but not all.

No, there is no mystery; yes, all of those data transfers take place.

I think you’re missing the key point: the cost is not zero.

They may not seem sufficient to you, but I assure you that I and every one of my counterparts at other companies use the following formula to decide these things:

  1. What is the benefit to the company?
  2. What is the cost to the company?

Even if we pretended that there was no cost whatsoever, no human time involved in coordinating or executing the process, I ask you again what do you perceive the benefit to the company to be?

Umh…goodwill? Cost reduction, since they don’t have to hand out data to suppliers?

Goodwill would need to be quantified. Goodwill with the consumer? Or with the customer (i.e., the retail chain)?

Cost reduction:

  1. We are the supplier, we’re handing them out to the customer
  2. If there were a free universal database with all of the required functionality, including up-time requirements, then everyone would drop the paid services in a heartbeat. But these things cost money to set up, organize, maintain, etc., so it’s not likely to happen for free.

Goodwill for anybody.

RaftPeople, as I write this, it is 2008. Costs for simple, even big, databases, are dirt cheap. The cost to write a program to post data on a web site periodically is dirt cheap. The cost to administer a web site is dirt cheap. Web site technology is pretty mature and data access is easy and dirt cheap. Every two-bit company in the world has a friggin’ web site, it’s so friggin’ cheap. Putting data on the web for public access is such a trivial exercise that hobbyists do it on weekends just for the fun of it even if the data is of limited use to the average Joe.

I myself have about 10GB of video posted to my $100/year website. Since I get 25GB of storage for that price, the 10GB is essentially free. I could post another 10GB tomorrow with no increase in cost. And I only use 5% of my allocated maximum monthly bandwidth for customer downloads.

Let’s give you an example. My county has a GIS website with aerial photos and maps of the entire area keyed to tax data. For the entire county, every land parcel, every owner, every inch. It is kept current within 2 months. Since it has graphics (the photos and maps), it is a little larger than a text-only database would be. It is available to the world, free, and is maintained as a part-time job by one dude in the IT department. It is hosted on the government servers. My WAG is the storage is around 10GB or less. (The complete tax data alone, uncompressed, is only 20MB. I know because I have a copy of the entire database in my home computer.)

I honestly don’t know what the cost is to the County (it was a software package they purchased), but our County is notoriously tight with money (good thing, it comes from the taxpayers) and this software was around for many years before they jumped on it. They are certainly not early adopters with money to throw around.

Why do they do this? For the convenience of anyone who needs or wants to use the data. They don’t care who. There is no cost to them to prepare the tax database, as it is maintained for tax reasons already. The photos are seriously downgraded versions of a project done 10 years ago for primarily geological reasons. The maps are part of geographic records that have been converted to digital. Nothing is posted to the web site that isn’t already part of a database somewhere else.

And it reduces the telephone inquiries, since all the data is available online. We Realtors used to make frequent calls to the Real Property Listing Dept. or even pay hundreds of dollars per month for access to a clunky, teletype, non-graphic database. No more. And the trend is towards more online data…they are scanning surveys and deeds into the system and gradually making those available, although there might be some charges for those because the state mandates it.

I doubt if 1% of the county’s taxpayers even know this site exists, let alone actually use it (it’s not good thru dialup, which most residents have). Yet the county pays for it because the cost/benefit ratio is so advantageous.

I haven’t tried every county in every state, not even in this one, but when I have occasion to look up similar data elsewhere, it seems other governments are doing much the same thing. We’re not unique, but typical.

So I see no reason why it can’t be done for a UPC number vs. description project. Heck, give me the 600,000 records in CSV form and I’ll post it to a corner of my website in the next few hours as HTML. I’m sure somebody will find it useful.

The goal of a company is to make money for its shareholders. If the goodwill helps do that, great, but to my knowledge there is not enough demand for something like this to even be on the radar (this is the first time I’ve ever heard it requested).

I think you are misinterpreting my use of the term “cost”. Human labor is the single biggest factor for something like this. Remember, I’ve written these exact interfaces (UPC/SKU feeds) numerous times; the most recent was about 18 months ago at the previous company I worked for, and it was one of the ftp interfaces to a dot-com style company. I’ve also created the EDI version of the same thing multiple times, as well as alternate styles where this type of data was fed to an outside web-based OLAP processor.

So, let’s review what the person doing this interface must do:

  1. Locate the universal repository (this takes some research unless there is one that is so popular everyone just knows about it)
  2. Map data elements from the internal database to the external database. This will involve decision making; for example, not only do we sell apparel, but we sell all kinds of home products, many of which you can’t tell what they are from the description (the retailer’s buyers know because they have context, working within a department, but the average Joe looking at zillions of products will not). So you need to figure out how to handle those situations, probably by some category, product class, or merchandising code. This takes time to analyze the exceptions and figure them out.
  3. You need to design an export mechanism that will find new and/or changed items. Is there a reliable data element in the system you can use? Typically yes, but not always.
  4. You need a delivery mechanism, EDI is too expensive (labor to setup and VAN charges) so you probably want to use ftp. But for this database to be valuable, the delivery of the data must be guaranteed, or errors detectable and recoverable. Interfaces that don’t do this are not very valuable. So you have to either purchase software to assist with this, or you write something yourself (I have typically written stuff myself in this area, but AS2 would certainly work).
  5. You need to deal with the security issues related to transmitting data outside the corporate firewall.
  6. You need to schedule this process to run nightly, which means you also need to find an available slot where there is no conflict with other processes.
  7. You need to establish a recovery procedure if it fails. The best approach is to design the interface so that subsequent runs automatically pick up any previously un-acknowledged transmissions.
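Steps 3 and 7 can be sketched together: a run picks up records changed since the last successful send, and any unacknowledged batch is automatically retried on the next run. All names and the record shape below are hypothetical; `transmit` stands in for whatever delivery mechanism step 4 settles on.

```python
def run_nightly(records, state, transmit):
    """One nightly feed run: send pending plus newly changed records.

    `records` carry a reliable last-modified timestamp (step 3);
    `transmit` is the chosen delivery mechanism (step 4: FTP, AS2, ...)
    and must return True only when the batch is acknowledged.
    """
    changed = [r for r in records if r["modified_ts"] > state["last_sent_ts"]]
    batch = state["pending"] + changed
    ok = transmit(batch) if batch else True
    return {
        # step 7: an unacknowledged batch is carried forward and retried
        "pending": [] if ok else batch,
        "last_sent_ts": max([state["last_sent_ts"]] +
                            [r["modified_ts"] for r in batch]),
    }
```

The point of folding recovery into the normal run is that nobody has to babysit a failed night: the next scheduled run heals it.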

None of that is rocket surgery.

All of that takes time.

Let’s pretend it takes 8 hours to get through all of that and then you can walk away from a robust interface that needs little to no further attention. When you compare that 8 hour project to every other item on the list, you will see that every other item has immediate tangible return on investment. For example, instead of finance keying into a spreadsheet every month to distribute certain costs across different divisions, you could have someone spend that 8 hours creating a report that does it automatically, thus saving labor every single month. But the 8 hours spent on the upc feed so consumers can lookup stuff doesn’t appear to return anything to the company.

The reason for existence of a government entity is very different from that of a company. Their goals are different and their decision-making processes are different.

I completely agree, I don’t see why it can’t be done either. Please note (I really feel like I’m repeating myself on this point), not only can it be done, I personally have done it numerous times. The technical aspects are clearly not a problem otherwise I (and others) would not have been able to accomplish this already.

However, few, if any, companies are going to start transmitting this data to a second database (other than the customer required services already in use) unless the equation changes. There just isn’t any demand (that companies are aware of) and even if there were demand, there doesn’t appear to be any gain.

The retailers want a central source so they don’t need to receive 832’s from every supplier, they want to go to 1 place. So they require the supplier to use services like QRS.

The suppliers will send UPCs to the services the retailers require (e.g. QRS), but sending to an additional database not required by the retailers costs money (takes time) for no apparent gain.

Then we’re in perfect agreement. :slight_smile:

You realize this is GQ not GD, right?

I provided you with factual answers to your question, I thought you would find it helpful.

I took the time to explain to you why that activity has a cost associated with it and why most companies would not engage in that activity. I included detailed items showing you the very real time involved in completing that mini-project and that there appears to be no benefit to the company. Yet you seemed to continually ignore that information. I’m not sure why. I thought you were asking a question because you wanted information, but it feels like you want to debate the reasoning, or that you are frustrated that a small effort by various companies would benefit you yet they refuse to take it upon themselves to do it.

Setting up a large database as simple as a UPC code database would be fairly easy, and not huge amounts of time.

But maintaining it would be. There will always be issues that need adjustments and some kind of attention from human beings, both on the database maintainer end and on the manufacturer end. And, yes, I do have personal experience with just that kind of database.
Something will always need attention: a manufacturer went out of business or merged and now the manufacturer code is being used by someone else, or some manufacturer updated their software, which broke some obscure part of the data interchange protocol and tied up the whole system, or UPCs are being reused, or something. Most of these are easy enough to fix, but it takes time to run down the issue and realize what’s going on.

In fact, I’ll say that there are enough potential issues that no database run by volunteers, no matter how dedicated, will be accurate enough for a business to rely on.

So, this free database would be useless to manufacturers’ customers. Therefore there’s no benefit to the manufacturer, and there would be a real ongoing cost in staff time.

Here is a free global barcode database: www.upcscavenger.com

The site allows users to add custom entries for UPC, EAN, and ISBN products. The database grows organically as searches are performed against product web APIs. There are currently over 1 million barcode entries available to search. Imports are performed against public MediaWiki sites to supplement additional background information. You can also generate QR codes for various purposes. Registering grants the user permission to add media content and edit pages.

I started writing the web site around six months ago. It is written entirely in PHP with a MySQL database. I did not use a library to generate the site, and consequently still have pages of to-do lists covering small tasks and larger enhancements alike. I have started on an Android application which should allow for a more interactive experience when it comes to scanning bar codes on the fly.

Here is a breakdown of how the site’s search functionality works. Feel free to comment.

Each bar code source provider is a separate database table. For example, an “Amazon” table in the bar code database holds all affiliate API records for amazon.com. All wiki pages are in a single “Page” table in the wiki database. This includes user profile pages, bar code specifications, and general MediaWiki page imports.

Each record is saved with a set of tags composed of soundex codes generated from the top 100 words present across all of the record’s fields. For products these are mainly derived from the description, but also from individual fields such as model and brand. For wiki pages the tags are mainly derived from the wiki content, an excerpt of the generated wiki, and the page’s related pages.

Bar code tables that do not generate new content automatically are only retrievable by referencing their bar code. The majority of the tables will perform a search against the soundex codes generated from the user’s search keywords and the tags field of each record. The most recently updated records are returned first, and 1,000 records per table are returned for each keyword. The final results are then sorted using a Levenshtein algorithm. Results are cached for 2 days for each keyword and source.
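A rough sketch of the matching pipeline described above, with simplified Soundex and Levenshtein implementations (the real site’s PHP code will differ in detail; all names here are my own):

```python
def soundex(word):
    """Classic 4-character Soundex code (simplified)."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    word = word.lower()
    out, prev = word[0].upper(), codes.get(word[0], "")
    for ch in word[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            out += code
        if ch not in "hw":            # h/w do not reset the previous code
            prev = code
    return (out + "000")[:4]

def levenshtein(a, b):
    """Standard edit distance via a rolling DP row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def rank(records, query):
    """Keep records sharing a Soundex tag with the query; sort by edit distance."""
    q_tags = {soundex(w) for w in query.split()}
    hits = [r for r in records if q_tags & {soundex(w) for w in r.split()}]
    return sorted(hits, key=lambda r: levenshtein(r.lower(), query.lower()))
```

The two-stage design is the interesting part: Soundex gives cheap, typo-tolerant candidate selection that can be precomputed as tags, while the more expensive Levenshtein pass only runs on the shortlist.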

Each search performed will generate inline “img” links which initiate an import from various web API sources through local PHP scripts. This ensures the imports are user generated as opposed to a search engine or other bot. Sources which have a low frequency of return data have a longer delay between imports. Keywords which are already cached, and tables which already have a large number of results for that keyword, will not initiate an import.