Making a 6-digit numeric hash from a string?

The big problem here is that something that should be trivial (generating unique indices for a set of records) is made impossible by proprietary library software that uses its own database format. I have to work within that constraint until and unless we migrate to better library software (which definitely exists, but is much harder to use).

So in the meantime, I can’t trust our organization (a tiny nonprofit) to have the infrastructure and institutional memory to properly keep track of sequential unique IDs in a separate database.

The only other option I can think of – and you’re welcome to propose better solutions if you know of any – is using a timestamp for the ID. Unix time seems to be 10 digits for the next few centuries so that might work, but the ID# generation will happen in batches, with records created within milliseconds of each other, so I’d need to figure out how to get more precision than that while keeping a fixed number of digits (10 or fewer) to maintain EAN-13 barcode compatibility.

Anyway, I think the hashes, while imperfect, seem like the least problematic at this point…

Timestamps also wouldn’t get you the same number for a title that was entered into the system twice.

Oddly, I discovered tries (that looks weird… what do you call more than one trie?) on my own after finding myself dissatisfied with hashes. I had a college course which had a little competition to see who could write the fastest program to generate a word histogram from a text. Came up with the idea of a tree with one node per letter. The final alphabetic sort was trivial. Placed second out of a few hundred students.
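
For anyone curious what that looks like, here is a minimal sketch of a word histogram built on a tree with one node per letter. It is not the original competition code (which isn't shown in this thread); the names `TrieNode`, `add_word`, `histogram`, and `word_histogram` are made up for illustration, and words are simply split on whitespace.

```python
from typing import Dict, Iterator, List, Tuple


class TrieNode:
    """One node per letter; count is nonzero where a word ends."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.count = 0  # how many times the word ending here was seen


def add_word(root: TrieNode, word: str) -> None:
    """Walk (and create) one node per letter, then bump the end count."""
    node = root
    for letter in word:
        node = node.children.setdefault(letter, TrieNode())
    node.count += 1


def histogram(node: TrieNode, prefix: str = "") -> Iterator[Tuple[str, int]]:
    """Yield (word, count) pairs in alphabetical order via a depth-first walk."""
    if node.count:
        yield prefix, node.count
    for letter in sorted(node.children):
        yield from histogram(node.children[letter], prefix + letter)


def word_histogram(text: str) -> List[Tuple[str, int]]:
    """Count whitespace-separated, lowercased words in text."""
    root = TrieNode()
    for word in text.lower().split():
        add_word(root, word)
    return list(histogram(root))
```

Because the children are visited in sorted order, the alphabetical listing falls out of the traversal with no separate sort over the whole word list.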

Anyway, I’ve used hashes all the time without collision detection. Obviously it only works under certain circumstances, but there’s no problem with the concept in general, even in situations where a lot is on the line.
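
To put rough numbers on "certain circumstances": with a 6-digit hash there are only 1,000,000 possible values, so the usual birthday-problem approximation applies. The sketch below assumes the hash spreads titles uniformly over that space, which a real hash only approximates, and the function name `collision_probability` is just for illustration.

```python
import math


def collision_probability(n_items: int, space: int = 1_000_000) -> float:
    """Approximate chance that at least two of n_items share a hash value.

    Uses the birthday-problem approximation 1 - exp(-n(n-1) / (2 * space)),
    assuming hash values are uniformly distributed over `space` slots.
    """
    return 1.0 - math.exp(-n_items * (n_items - 1) / (2 * space))


if __name__ == "__main__":
    for n in (100, 500, 1000, 2000):
        print(n, round(collision_probability(n), 3))
```

Under that assumption, even odds of at least one collision arrive at a little under 1,200 items, so whether skipping collision detection is acceptable depends heavily on the size of the collection and on what a collision would cost.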

Right, there is no need to limit the barcode to ISBN; EAN barcodes can also be scanned by a fake barcode reader utility in VB.NET.

A year and a half late, but …

How will (did) you ensure the title is entered identically every time?

A Story About A Dog Vol 2. --> 320812
A Story About A Dog, Vol 2. --> 563827
A Story About A Dog Vol 2 --> 478901

I stripped whitespace, punctuation, and capitalization so that all three became “astoryaboutadogvol2”. It’s not perfect – “Story About A Dog, A” would be wrongly separated, for example – but close enough. Most of the books had ISBNs anyway, so the title was only a last resort.

Edit: Actually, looking at my code above, it looks like I DIDN’T do that but should’ve.
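
For reference, a minimal sketch of that normalization step followed by a 6-digit hash. The original code isn't reproduced in this thread, so the choice of SHA-256 and the names `normalize_title` and `six_digit_hash` are assumptions; the numbers it produces will not match the examples quoted above.

```python
import hashlib
import string

# Characters removed before hashing: all whitespace and ASCII punctuation.
_DROP = set(string.whitespace) | set(string.punctuation)


def normalize_title(title: str) -> str:
    """Lowercase the title and strip whitespace and ASCII punctuation."""
    return "".join(ch for ch in title.lower() if ch not in _DROP)


def six_digit_hash(title: str) -> str:
    """Map a title to a zero-padded 6-digit string.

    Assumption: SHA-256 reduced modulo 1,000,000. Distinct titles can
    still collide; this only makes equal normalized titles agree.
    """
    digest = hashlib.sha256(normalize_title(title).encode("utf-8")).hexdigest()
    return f"{int(digest, 16) % 1_000_000:06d}"


if __name__ == "__main__":
    for t in ("A Story About A Dog Vol 2.",
              "A Story About A Dog, Vol 2.",
              "A Story About A Dog Vol 2"):
        print(normalize_title(t), six_digit_hash(t))
```

All three spellings normalize to the same string and therefore get the same ID. Reordered titles such as “Story About A Dog, A” still hash differently, as noted above.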

thanks for sharing

In your case, a simple linear barcode will do, such as ISBN or EAN-13, all of which support encoding data of a specified length. Just search online and you will find plenty of barcode software to do that.

That’s exactly what we did. Hashed titles into ISBN-compatible barcodes.
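
A rough sketch of how a 6-digit hash could be packed into a 13-digit EAN-style number. The check-digit rule (weights alternating 1 and 3 over the first 12 digits) is the standard EAN-13 one, but the padding prefix `200000` and the function names are hypothetical and are not what was actually used here. As far as I know the 200-299 prefix range is set aside for restricted, in-house numbering, but verify that against your own scanner and library software.

```python
def ean13_check_digit(first12: str) -> int:
    """Compute the EAN-13 check digit for the first 12 digits.

    Digits in positions 1, 3, 5, ... (counting from the left) get weight 1,
    positions 2, 4, 6, ... get weight 3; the check digit brings the
    weighted sum up to a multiple of 10.
    """
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def title_hash_to_ean13(hash6: str, prefix: str = "200000") -> str:
    """Pad a 6-digit hash with a (hypothetical) 6-digit prefix and add the check digit."""
    if len(hash6) != 6 or not hash6.isdigit():
        raise ValueError("expected a 6-digit hash")
    body = prefix + hash6
    return body + str(ean13_check_digit(body))


if __name__ == "__main__":
    print(title_hash_to_ean13("320812"))
```

Keeping the prefix fixed means the hash is the only part that varies, so the barcode stays a fixed length regardless of the title.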