Question for data warehouse people

Get a Technical Writer and put it on an online knowledge base. There is nothing wrong with writing it in plain English. IMHO. Especially for your varied user base.

The OP mentioned that there is no common key between the different datasets. Without knowing what the data is all about, it’s hard to address this.

But, in the end, everything has a location and a Primary Key. GIS.

It is GIS if there is geographically specific data, but not all data is geographic in nature. GIS is really popular because the largest current use of big data is marketing-type applications, where location and regional distribution are primary characteristics, but it has no bearing on, say, data from particle collisions measured by experiments at the LHC or observations of potentially hazardous objects in solar orbit.

There is nothing wrong with documenting and marketing the data product in English, and that could be a part of the metadata associated with data sets, but English is almost never the best way to describe data that you intend to use to develop quantitative estimates or otherwise describe with statistics. All data will have key values or metrics, but not all data neatly fits into the relational database paradigm (where there are one or more metrics that can serve to index the entire dataset). Again, someone who actually works in or has experience with the industry (or industries) of the user base will probably be best able to advise on how to structure databases or metadata for best use, and can take over the grunt-level process of categorizing and organizing the data from the “management by committee” approach that the o.p. describes. The überlords can instead be given decisions like what PowerPoint template should be used to present the data or how to reformat their company logo to match their mission statement, and everybody can contribute in a manner commensurate with their aptitude and experience.

Stranger

You may be surprised at the different uses for GIS. It is often the common key. Even if the data moves. All data has a location. Sometimes, we look at it backwards.

Data follows what it is relevant to. Location is often the key.

Back in 1989, I used it to help map cable and electrical runs in large buildings. I actually tweaked that application to track cell phone towers. Cell phones were in their infancy. It was… Interesting.

Back then it was called AM/FM. Automated Mapping and Facilities Management. A precursor to GIS.

But I digress.

Anyway, it’s difficult to solve a DB, user interface/knowledge problem when we don’t know the user base or the data. GIS does solve a lot of issues because it has a common key for all data. Location. Start from that and move back as needed.
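To make "location as the common key" concrete, here is a rough Python sketch. Everything in it is made up for illustration: the two datasets, the field names, and the 0.01-degree cell size are assumptions, not anything from the OP's data. It snaps each record to a grid cell and uses the cell as the join key. A real GIS would use a proper spatial index and distance test, since this naive version misses points that sit on opposite sides of a cell boundary.

```python
import math
from collections import defaultdict

# Two hypothetical datasets with no shared ID; each record only has a lat/lon.
towers = [
    {"tower": "T-1", "lat": 40.7130, "lon": -74.0060},
    {"tower": "T-2", "lat": 34.0522, "lon": -118.2437},
]
outages = [
    {"ticket": 101, "lat": 40.7127, "lon": -74.0059},
    {"ticket": 102, "lat": 34.0525, "lon": -118.2440},
    {"ticket": 103, "lat": 41.8781, "lon": -87.6298},
]


def grid_key(lat, lon, cell_deg=0.01):
    """Snap a coordinate to a grid cell; the cell acts as the join key."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def join_by_location(left, right, cell_deg=0.01):
    """Pair up rows from two datasets that fall in the same grid cell."""
    index = defaultdict(list)
    for row in right:
        index[grid_key(row["lat"], row["lon"], cell_deg)].append(row)

    pairs = []
    for row in left:
        key = grid_key(row["lat"], row["lon"], cell_deg)
        for match in index.get(key, []):
            pairs.append((row, match))
    return pairs


matched = join_by_location(towers, outages)
result = [(t["tower"], o["ticket"]) for t, o in matched]
print(result)  # [('T-1', 101), ('T-2', 102)]
```

Ticket 103 has no tower in its cell, so it drops out of the join, which is the same thing that happens to any data that has no meaningful location.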

My suggestion to the OP is an online knowledge base, and a tech writer (or two or five, we have no idea how big the OP’s company is). The committee, of course, has to go.

No, I understand that GIS applies to a vast array of uses; I’m just pointing out that not all applications are keyed to geographic location or information.

A technical writer may be good for generating documentation, but they still need direction on how the data set is going to be used and what factors are critical in describing it. That is where someone with expertise in how the data is going to be interpreted and manipulated, along with someone who knows the specific industries of the end users, can help define and structure the data in a usable way.

Stranger

It may be that you do need a formal language. Some form of ontology definition language. The language of the moment is probably OWL. Again, any sort of rigorous formalism tends to focus the mind.
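As a taste of what a formal definition buys you, here is a minimal Python sketch. It is not OWL, just a plain stand-in for the idea, and the field names, meanings, and allowed values are all hypothetical. Each delivered field gets one machine-checkable definition tied to a real-world meaning, and records are checked against it instead of against prose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldDef:
    name: str                            # field name as delivered (hypothetical)
    meaning: str                         # the real-world element it stands for
    py_type: type                        # expected Python type of the value
    allowed: Optional[frozenset] = None  # closed set of legal values, if any


# A tiny, made-up data dictionary keyed by field name.
DICTIONARY = {
    f.name: f
    for f in (
        FieldDef("acct_type", "Kind of customer account", str,
                 frozenset({"retail", "business"})),
        FieldDef("balance_cents", "Account balance in US cents", int),
    )
}


def validate(record: dict) -> list:
    """Return a list of problems found when checking a record against the dictionary."""
    problems = []
    for name, value in record.items():
        spec = DICTIONARY.get(name)
        if spec is None:
            problems.append(f"{name}: not defined in the dictionary")
        elif not isinstance(value, spec.py_type):
            problems.append(
                f"{name}: expected {spec.py_type.__name__}, got {type(value).__name__}"
            )
        elif spec.allowed is not None and value not in spec.allowed:
            problems.append(f"{name}: {value!r} is not an allowed value")
    # Report defined fields that the record left out, in a stable order.
    for name in sorted(DICTIONARY.keys() - record.keys()):
        problems.append(f"{name}: missing from record")
    return problems


print(validate({"acct_type": "retail", "balance_cents": 1200}))
print(validate({"acct_type": "gold", "balance_cents": "12"}))
```

The point is less the code than the discipline: a committee can argue about prose forever, but a definition like this is either satisfied by the data or it is not.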

Beautiful. If withering contempt could kill you’d be 007. Thank you.

[QUOTE=JcWoman;19249228]
[ol]
[li]We are probably unclear on who the audience for the documentation is. Developers certainly. Possibly non-technical analysts as well, although I’m not entirely sure about that. More on that in a bit…[/li]
[li]The data is non-homogeneous, unkeyed and unindexed. That’s likely causing the larger portion of the difficulty. While stored in our database, it is indexed. And when customers buy the data, they upload it into their database and index it in whatever way makes sense to them. After that, though, to USE the data, there are no keys to work from. It’s entirely matching each of the elements (or fields, cells or columns depending on your frame of reference) to a real world data element in order to find the precise data you need for that exact transaction.[/li]
[/ol]

More on the audience question: We know that our customers have their own development teams who work on uploading the data into their databases and create the programs that process it according to our processing documentation. **Strangely whenever there are questions or problems, we never hear from their developers directly. Their analysts and/or managers are the ones who pass along issues and questions to us. This is a very odd situation, I think.**
[/QUOTE]
Bolding mine.

IMO this is the key. Even if the specifics of this suggestion may be formalism overkill for the OP’s reality. The OP’s current solution lacks rigour because their current approach lacks rigour. Which they pay for in massively excessive customer support at a silly-high (and unbillable) cost to themselves.

Looking at these two items together …

When you write your tech docs in business committee-speak, it falls to BAs & business SMEs at the customer companies to try to fill in (“guess in” actually) the gaps & contradictions in your docs for their developers. And when they fail, their BA/SMEs are the ones who get tasked to call your BA/tech support for understanding. The game of telephone becomes inevitable. Because you designed it into your product. (Considering your data and your docs as a single combined product set since either is useless without the other.)

The fact all your customers do the same thing is a clear sign of a major impedance mismatch between your actual documentation product and your actual customer. Not the customer who signs the purchase order; the customer who *uses* your docs and depends on them to get through their day.

Here’s a thought …

We used to create 4 layers of documentation for our gnarly data center scale framework stuff. Think SAP but industry-specific.

One documentation layer was for the customer devs who run in our framework and extend it with their code. One for our customer IT folks who install it, configure its guts, update it, and diagnose the failure messages. One for the customer power users who use the admin UI to configure it at the user level, and for our hand-holding staff who teach them what it can do and help them decide which features to use when & why. And finally end-user documentation for how to use the UI and understand the outputs under various common configurations.

Each set of docs was written by somebody from the corresponding level at our end. Our devs wrote the dev docs. Our IT folks wrote the IT docs. Our BA/trainer folks wrote the admin docs. And our testers / help desk folks wrote the end user docs.

A tech writer-type went over the stuff for basic English, consistency, formatting, and all that other tech writery goodness. Expensive. But when we were done we had very few impedance mismatches between docs & the audience for those docs. When we took calls on issues not addressed in the docs it was a big deal because it was so very, very rare.

You’re selling data, not very complex massively configurable software systems like we were. So you probably don’t want to directly copy our approach. But you can think about how our approach to our problem can be metaphorically applied to inform your approach to your problem.