Is there a general name for computer data text files organized like this?

I’m using .kml and .gpx files, which are data files for GPS tracks such as a hiker records. They are human-readable text, but they are formatted in similar distinctive ways. The structure is nested, with these <name> and </name> delimiters containing predictable blocks between them, like an ISO 8601 timestamp and a set of coordinates. Is there a name for, generally, structuring a data file like this, with this form of start-and-stop delimiters?

Here’s a .kml example, the first few lines of a file:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2">
<ExtendedData>
<Data name="GPSModelName">
<value>Dual XGPS160</value>
</Data>
</ExtendedData>
<Placemark>
<gx:Track>
<altitudeMode>absolute</altitudeMode>
<when>2017-02-04T18:30:31Z</when><gx:coord>-75.809845 39.722271 41.1</gx:coord>
<when>2017-02-04T18:30:32Z</when><gx:coord>-75.809814 39.722229 44.2</gx:coord>
<when>2017-02-04T18:30:33Z</when><gx:coord>-75.809723 39.722229 47.2</gx:coord>

XML. The first line is what says it is so, along with the version and encoding.

Were you looking for something more specific to your particular file?

More generally this is an example of a structured data document type - XML is the most common example but hardly the only one. Here is a pretty decent primer on XML structure/function.

Hey, thanks! Just what I needed to know! Not looking for something more specific about this file. I wanted to find out more about pre-existing tools and code snippets for extracting information from the files, and now I have an angle to work. Thanks!

Glad to help. Now that you know what to search for, you will find loads of different libraries and tools for dealing with these files.

XML is a pain in the ass. But it has one advantage: it’s very universal, which means there’s a ton of tooling available for working with it. It also has a disadvantage: it’s very universal, which means there’s a ton of tooling available for working with it.

If you get to choose which format you are working with, you might want to consider JSON instead. It has less overhead (in terms of the number of wasted characters) and the parsers are generally easier to use.
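For comparison, here is what one track point could look like in JSON (a hypothetical layout, not any standard GPS schema), and parsing it is a one-liner with Python's standard library:

```python
import json

# Hypothetical JSON layout for one track point (not a standard schema).
point_json = '{"when": "2017-02-04T18:30:31Z", "coord": [-75.809845, 39.722271, 41.1]}'

point = json.loads(point_json)  # one call, no namespaces to wrangle
print(point["coord"][0])        # -75.809845
```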

I don’t get to choose what format to work with. It’s really one very specific thing I need to fix. I’m recording hiking trails, and for a while I used a Dual XGPS160 GPS receiver and logger. Unfortunately one version of its accompanying phone app records my longitude, which is really around -75.5, as around -283.5. This is just a flat out illegal value. I have about 150 miles of recordings written this way. It looks like I could change the sign and subtract from 360, or some similar formula, and recover the tracks. I’m just trying to figure out how to do that. Knowing this is XML may let me use existing tools to handle most of the logic to break each file down, change this one value, and build a new one back up that is otherwise the same.
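A minimal sketch of that repair using Python's built-in ElementTree, assuming the bad values are exactly the longitudes below -180 and that -(bad + 360) recovers the true longitude (the function name and file handling here are my own invention, not an existing tool):

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
GX_NS = "http://www.google.com/kml/ext/2.2"

def fix_longitudes(src_path, dst_path):
    """Rewrite a .kml file, correcting any longitude below -180 degrees."""
    # Register the prefixes so the output keeps the original tag spellings.
    ET.register_namespace("", KML_NS)
    ET.register_namespace("gx", GX_NS)
    tree = ET.parse(src_path)
    for coord in tree.iter(f"{{{GX_NS}}}coord"):
        lon, lat, alt = coord.text.split()
        lon = float(lon)
        if lon < -180.0:
            # Assumed recovery formula: values near -284 map back near -76.
            lon = -(lon + 360.0)
        coord.text = f"{lon:.6f} {lat} {alt}"
    tree.write(dst_path, xml_declaration=True, encoding="UTF-8")
```

Everything else in the file (timestamps, metadata, element order) passes through untouched, which matches the goal of building a new file that is otherwise the same; note that ElementTree may normalize small formatting details like attribute quoting.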

While I’m at it I’d like to extract ALL the coordinates, from these bad files as well as all my good ones. I wrote, years ago, code that can write a .dxf file that Autocad can import, based on a series of points. I may create my own map and have control over all the details. And there’s a park ranger eager to work with me to use the data in their own next generation map, so it’d be nice to be able to clean the data and deliver it in a way they can use well.
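Pulling every point out is the same kind of tree walk; here is a sketch assuming each <when> pairs with the <gx:coord> at the same position inside a gx:Track (the function name is made up):

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
GX_NS = "http://www.google.com/kml/ext/2.2"

def extract_track_points(path):
    """Return (timestamp, lon, lat, alt) tuples from every gx:Track in a .kml file."""
    points = []
    tree = ET.parse(path)
    for track in tree.iter(f"{{{GX_NS}}}Track"):
        # <when> lives in the default KML namespace, <gx:coord> in the gx one.
        whens = track.findall(f"{{{KML_NS}}}when")
        coords = track.findall(f"{{{GX_NS}}}coord")
        for when, coord in zip(whens, coords):
            lon, lat, alt = (float(v) for v in coord.text.split())
            points.append((when.text, lon, lat, alt))
    return points
```

From a list like that it should be straightforward to feed the points into existing .dxf-writing code as polyline vertices.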

Thanks, everybody!

One assumes this should be 284.5 and the silly thing has wrapped you around the globe - and got it wrong.

Code up your stuff in Python and you will have access to enough pre-baked libraries that you could do the entire data massaging task in only a few tens of lines of code.

Similarly, I have a back-burner project that would involve processing SVG files, and I was sort of dreading having to figure out how to manage them. Then, for a different project, I was following some directions I found online and ended up opening an SVG in TextEdit, and discovered that they’re just human-readable XML, too. That’s not nearly so intimidating any more.

Yes, there are a lot of data files out there that are human readable. I like this trend. We have too much important data at work that was stored in some dumb proprietary format and got marooned after a few years. I guess proprietary formats are useful in the short term, like letting a car idle, but it’s no way to leave things stored for the ages.

XML is a ubiquitous abomination.

Sometimes, even if you find a file that is not human-readable, it is secretly a zip file containing a compressed human-readable format.
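That guess is cheap to test; a quick sketch with Python's standard library (the helper name is made up):

```python
import zipfile

def peek_if_zip(path):
    """Return the member names if the file is secretly a zip archive, else None."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return zf.namelist()  # often a manifest plus content XML files
    return None
```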

I once had to reverse engineer a proprietary file format that we got from a vendor who had long since gone out of business. It turned out to be just a zip of (you guessed it) XML files. One of the XML files just contained a list of the other XML files.

Note: while XML has serious issues (mostly around security), you can get the exact schemas you need to decode those files.

http://www.opengis.net/kml/2.2

<element name="when" type="kml:dateTimeType"/>

But you will notice that the other schema is now a 404, and external public schema URLs are very risky from a security perspective.

But for common formats there is almost always a good Python module.

https://fastkml.readthedocs.io/en/latest/usage_guide.html#read-a-kml-file-string

In general, JSON, XML, and other documents like this are structured as directed graphs with strict arboricity, or at least something that approaches it.

They are almost always just a set of objects (vertices, or nodes) in a tree structure.
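To make that concrete with Python's ElementTree: parsing gives you exactly that node tree, and a few lines of recursion walk it (toy document and made-up function):

```python
import xml.etree.ElementTree as ET

# A tiny document: one root node with nested children.
doc = ET.fromstring("<Placemark><Track><coord/><coord/></Track></Placemark>")

def depth(node):
    # Iterating an element yields its child elements, so this walks the tree.
    return 1 + max((depth(child) for child in node), default=0)

print(depth(doc))  # 3: Placemark -> Track -> coord
```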