GIS gurus, help me understand Shapefiles, and some tips

It seems I’ve gotten old: the last time I seriously used a GIS was about 15 years ago. That was MapInfo, and I used it for mapping out locations and that’s about it. Before that it was a DOS-based system.

Recently I have wanted to do my own project for my local county. I installed GRASS and QGIS and downloaded some shapefiles from the local county government to play with. Except now I’m really confused. I’ve tried reading the Wiki page on shapefiles and don’t really understand it.

My problem is that when I open my data, for instance roads, all the roads are the same line weight and I cannot find out how to change that. Even when I go to make my own shapefile, I can only seem to have one type of point or line; if I try to change one, they all change color or size. It’s gotten to be really frustrating.

When I started to read up on shapefiles, it almost seemed that each file can hold one, and only one, type of feature, which really makes no sense. This can’t be correct, can it? I can’t need one file for highways, one for state roads and another one for dirt roads, can I?!

My project is to take an old county atlas, one done in the 1870s, and redo it digitally. The problem is that this atlas was not done by professional surveyors, but rather by people walking along with a measuring wheel and taking down the names of the people who owned the property. So while the map works, it’s not a great representation of the real world. I can, however, warp the map and shift it to work well enough. Since a lot of the roads haven’t changed much, it’s easy enough to put a point at an intersection. The point isn’t to have the new map be exact, but rather to be able to look up a name and say, oh, near that intersection, and go from there.

About a year ago I made an Excel index of all the names in the atlas, thousands of them. I would have thought, and hoped, that I could import the spreadsheet, sans coordinates, select a name and place a point in a new shapefile. However, it seems that ArcInfo, which I have access to at work, doesn’t operate this way. Asking in some GIS forums didn’t help, as they said I would need coordinates to start out with. Is there a way to create a shapefile without having any coordinates? I’d rather not have to retype thousands of names. I have other spreadsheets where I do have some coordinates, but even if only some are blank I can’t get the import to work.

This can’t be that difficult, can it? I must be missing some sort of basic knowledge.

If you can post the shapefile somewhere, it would be easier for us to massage that into whatever you want.

But if I understand you right, you’re asking a few separate questions:

  1. Roads are typically represented as lines, not areas/polygons, so their size isn’t automagically visible unless it’s in the dataset somewhere. If you look at the shapefile’s associated data, perhaps the roads are categorized into “highway, rural, street, bike trail” or whatever, and you would come up with a set of visual rules yourself to best illustrate that in your own design.

  2. It sounds like the second question is that you have a bunch of lines floating around in unidentified space, not georeferenced to an actual physical location. It’s best to have this identified in the metadata to begin with. You can manually re-georeference them over a satellite map using QGIS’s Georeferencer function, but that will be less than ideal. If you don’t know which datum, projection, etc. the original data was created in, you will get projection errors over a large enough area. Depending on what you’re trying to do with the map, that may or may not be a big deal. For something like a tourist map of a city, manual georeferencing should be good enough. For more detailed work, you don’t wanna butcher the (already limited) precision if you can help it.

Basically, GIS files exist in a really archaic data scheme from the last century, and a lot of the data and metadata is split up into separate files – in contrast to more modern formats like Google Earth’s .KMZ that plot a bunch of points for you with two clicks. So you usually have to read the associated readmes or metadata, and tell your GIS program what’s what, for the points to come up accurately.

Actually, I forgot that the shapefile will likely be vectors, so manually georeferencing will be more difficult. If you can identify the projection it was initially stored in, that would make the job easier.

I think you are missing some basic information.

A single shapefile is really three or more files. One file contains the geometry data (.shp), one contains all the other data (the attributes, .dbf), and there is usually one that describes the projection (.prj). There is also a small index file (.shx) that goes along with the .shp.
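As a rough sketch of what "one shapefile is several files" means in practice, here is a small Python check for which of those companion files sit next to a given .shp. The function names are made up for illustration, and it assumes lowercase extensions.

```python
from pathlib import Path

# Companion files that make up one "shapefile".
# .shp = geometry, .shx = geometry index, .dbf = attribute table.
REQUIRED = (".shp", ".shx", ".dbf")
# .prj (projection) is optional but very useful to have.
OPTIONAL = (".prj",)


def shapefile_parts(shp_path):
    """Return {extension: bool} saying which companion files exist."""
    path = Path(shp_path)
    return {ext: path.with_suffix(ext).exists() for ext in REQUIRED + OPTIONAL}


def missing_required(shp_path):
    """List the required companion files that are absent."""
    parts = shapefile_parts(shp_path)
    return [ext for ext in REQUIRED if not parts[ext]]
```

If `missing_required("roads.shp")` returns anything, most GIS programs will refuse to open the layer or will open it without attributes.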

A shapefile will have one type of geometry: points, lines or polygons. The attribute data will describe if a line is a highway, bike path or whatever. The way the geometries are displayed is a function of your software, not the shapefile itself. Usually.
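To illustrate the split between data and display, here is a minimal sketch of what the software does when it styles by attribute. The field name `CLASS`, the road classes and the style values are all hypothetical; your county's attribute table will have its own column names.

```python
# Hypothetical attribute rows, as they might appear in a roads .dbf table.
roads = [
    {"NAME": "Interstate 70", "CLASS": "highway"},
    {"NAME": "Route 85", "CLASS": "state"},
    {"NAME": "Old Mill Lane", "CLASS": "dirt"},
    {"NAME": "Unnamed track", "CLASS": ""},
]

# Visual rules chosen by the map maker, not stored in the shapefile.
STYLE_RULES = {
    "highway": {"color": "red", "width": 2.0},
    "state": {"color": "orange", "width": 1.2},
    "dirt": {"color": "brown", "width": 0.5},
}
DEFAULT_STYLE = {"color": "gray", "width": 0.8}


def style_for(feature, field="CLASS"):
    """Pick a line style from the feature's attribute value."""
    key = str(feature.get(field, "")).strip().lower()
    return STYLE_RULES.get(key, DEFAULT_STYLE)


for road in roads:
    print(road["NAME"], style_for(road))
```

In QGIS the equivalent is a categorized style on the layer, keyed on whichever attribute column holds the road type, so one roads file can still be drawn with many line weights.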

You can have a shapefile with no coordinates (with no .shp), but then it’s simply a database table. You really need the shape part to be able to display the data in a maplike way. It’s ESRI’s format for storing geometries.

This might be the problem, I just can’t see/figure out how to change that. I will see if I can post a couple of files, they are by the government and the licenses said no selling so it’s good.

Actually, I have a spreadsheet with a bunch of names in it. The names came from these types of maps. I originally did it just to have the names indexed. Later I decided to try and remake it in a GIS format because I’d like to do other things with it in the future. I was hoping to be able to bring the data into a table, select a name and say, “it’s here”. Because a lot of the roads from the 1800s were just paved over there hasn’t been a whole lot of change, so I can say, George was at this intersection, it might be off by a few hundred feet, but honestly it doesn’t matter, it gives a starting point.

These maps were not done like a real map would have been done, more like an artist’s rendition, they are kind of close, but not great, directions and distances were kind of good, but the maps need to be warped and shifted a lot.

What I did was take my roads shapefile and create a new shapefile from that, so the extents, projection and such were all the same. I know that with MapInfo one could take a table with just names, import it, and then place a point and say here it is. The ones I’ve been playing with don’t seem to want to do that.

So you’re telling me I need a lot of beer then?

Sorry, I just re-read your question a few times and I think I understand it a bit better now.

  1. You have one or more shapefiles of contemporary roads from your county government
  2. You have an old map? Is this a GIS file or just a scanned picture?
  3. From the old map, you manually made a list of place names (without locations) in Excel.

If #2 is just a JPEG or similar, it would be easy to overlay it over an existing layer (instead of using your county’s shapefiles, you can just use the Google Maps plugin for QGIS for a nicer base layer). It would be as you described: in the QGIS Georeferencer, find the same intersections on both maps and mark them in several places spread far enough apart. It’ll apply mathematical wizardry and warp the old map to fit your new projection.
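For the curious, the simplest version of that wizardry is an affine fit: given matched control points, find the scale/rotation/shift that best maps scan pixels to map coordinates. Below is a minimal pure-Python sketch of a least-squares affine fit. It is not what QGIS runs internally, and a hand-drawn atlas will likely need one of the more flexible transformations the Georeferencer offers (polynomial, thin plate spline), but the idea is the same.

```python
def solve3(m, v):
    """Solve a 3x3 linear system m * p = v by Gauss-Jordan elimination."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        # Partial pivoting: use the row with the largest value in this column.
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points look collinear")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(src, dst):
    """Least-squares affine transform from src (scan) to dst (map) points."""
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least three matched control points")
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (A^T A) p = A^T b, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in (0, 1):
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        coeffs.append(solve3(ata, atb))
    return coeffs  # [[a, b, c], [d, e, f]]


def apply_affine(coeffs, point):
    """Map one scan point into map coordinates."""
    x, y = point
    (a, b, c), (d, e, f) = coeffs
    return (a * x + b * y + c, d * x + e * y + f)
```

With more than three control points the fit averages out individual clicking errors, which is why marking many well-spread intersections helps.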

#3 is the hard part. If it’s just a list of names like Main St, MLK Jr Drive, Oceanview Rd with no geographic data associated, there’s not a whole lot you can do with it besides copy and paste each label onto streets, or perhaps write a macro to let you point and click that easier without having to CTRL-C, CTRL-V a thousand times. But meh. Either way it’s gonna be a pain in the ass. It might be possible to try and match old street names with a list of the new (current names are probably in a data table from the road shapefile(s)), but I don’t know how that would work with road names over time – what happens if a road gets lengthened or renamed?
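If you do try matching old names against the current road names, fuzzy string matching can at least suggest candidates to check by hand. A minimal sketch using Python's standard `difflib`; the function name and the sample names are made up, and the cutoff would need tuning on real data.

```python
import difflib


def suggest_matches(old_name, current_names, n=3, cutoff=0.6):
    """Return up to n current names that look similar to old_name."""
    # Compare in lowercase, but hand back the original spelling.
    lookup = {name.lower(): name for name in current_names}
    hits = difflib.get_close_matches(
        old_name.lower(), list(lookup), n=n, cutoff=cutoff
    )
    return [lookup[h] for h in hits]


# Hypothetical names, for illustration only.
current = ["Main Street", "Maple Ave", "Oceanview Road"]
print(suggest_matches("Main St", current))
```

This does nothing for roads that were renamed outright; those still need a human who knows the area.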

Yes.

This is a scanned picture; no GIS files from the 1870s, I think. :) I want to make one.

Yes, I have a list of all the names of people, churches, shops, etc.

I warp at work so I do have a good idea of how to do that, and because of the way the map was made it looks really, really bad, but it’s enough to work with. I didn’t know about QGIS’ plugin, I haven’t played with it that much.

No, it’s just a ton of points I have. If you looked at the map in my last post, it’s got lines for roads but they are not labeled. It’s very easy to know what road is what when you’re familiar with the area, though.

My thought was I could:

  1. take the county road data
  2. warp, as best as possible, my old scanned map
  3. create a new file for the names by importing the excel file. I was hoping that I could have the table open, look on the map, select a name like Eddie Head, then say that point lies here. I started out by creating a new shapefile and redoing all the damn points, but then I remembered I have 6000+ names and started looking for an easier way to do it.

I sent you a PM with a link to the roads files if you’d like to look at them.

So here’s some basic work done on it:
screenshot and the zip file for qgis (oldmap.qgs)

What I’ve done is:

  1. Added a Google Maps base layer for ease of use (you’ll need the OpenLayers plugin from inside QGIS)
  2. Added your county shapefile
  3. Georeferenced the old map to fit the new
  4. Symbolized roads by type (ave, blvd, hwy, etc.) and labeled them by name

From there you’ll have to:

  1. Tinker with the visual presentation – transparencies, street types and colors/thicknesses, etc.
  2. Find some way to put your thousand old names onto the new – not sure if there’s a quick and easy way to do that, sorry

There is something called “geocoding” and many plugins available for it. It will look up place names in a given county/state and try to place them on the map for you by looking for them in geographic databases like Google’s. You can give that a shot – even an 80% success rate would save you a lot of time, I’d imagine.

The easiest way I can think to do the placenames shapefile, which will still require a lot of manual labor, would be to load the Excel file with pseudo-geographic data (say, XY coordinates of a point near the location being mapped, with each location offset into a grid pattern of dots) and manually move them around the map. It would be a fairly easy job, just very repetitive.

Of course, I have never used QGIS. Here at the office I use ESRI products, so I can’t be certain how easy any of it would be. Tonight I will be installing QGIS at home to check it out.

It’s doable in QGIS too, but equally repetitive. OP, you would add fake coords to the list of names (X,Y or long-lat of the center of town or something), add it to QGIS as a Delimited Text Layer, and then individually move points to where they should be.
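A rough sketch of that fake-coordinates step in Python, assuming the names have already been pulled out of Excel into a plain list. The origin, spacing, column names and sample names are all made up; the grid layout just keeps the points from stacking on top of each other before you drag them into place.

```python
import csv


def add_placeholder_coords(names, origin_x, origin_y, spacing, per_row=50):
    """Give each name a fake X/Y on a grid starting at the origin."""
    rows = []
    for i, name in enumerate(names):
        col, row = i % per_row, i // per_row
        rows.append({
            "name": name,
            "x": origin_x + col * spacing,  # step right along the row
            "y": origin_y - row * spacing,  # then drop down a row
        })
    return rows


def write_csv(rows, path):
    """Write the rows as a delimited text file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "x", "y"])
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical example: three names near a made-up town-center coordinate.
rows = add_placeholder_coords(["Eddie Head", "J. Smith", "Mill"], -77.40, 39.40, 0.001)
write_csv(rows, "names_with_fake_coords.csv")
```

The resulting file can then be added as a Delimited Text Layer with `x` and `y` as the coordinate columns, and saved out as a point shapefile for editing.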

If I were you, I’d just form a Buckeytown Historical Society or some such and recruit slaves/interns from a local university or community college. Or you can outsource it to Amazon Mechanical Turk.

Thanks, that’s pretty much what I was hoping to do. I was having a real hard time getting anything to label or anything like that. I really haven’t played around with QGIS at all. I will have to tonight.

Yes, I knew I’d have to play around with the presentation. I was really hoping not to have to retype names, some of them were really hard to figure out.

The manual labor isn’t a huge deal, it’s just typing the names again. Hopefully I can find a way to just move the points easily.

Ha, the people that run the local historical societies will not see the benefit of such a project until a lot of it’s done. There are a lot of things I can see adding to this project, the atlas was a start since I’m familiar with it and it’s a good learning point for me.

So another question that I still don’t get, I guess. In a shapefile, can I have points AND shapes in the same one? Do I need more parts to the file? For instance, the other thing I’ve been working on is all the cemeteries in the county. There are some 300 of them, with 200 or so being very small, think 1-15 graves, so they wouldn’t need more than a point. For the larger ones I had thought to make shapes. Can I or can I not put all of this into one shapefile? It seems so needlessly complex.

ETA: Thanks guys for your help.

It IS needlessly complex, and it’s terrible. Each shapefile can only have one type of shape – points, lines, or polygons. But you can just add another “layer” and save that as a separate shapefile. Each project can contain multiple shapefiles on top of each other, with transparencies and z-indexes (which one’s on top of which) set by you. Each GIS project (or whatever the Arc term for that is) comprises multiple shapefiles, images, metadata, maybe data tables, CSVs, whatever, in one big clusterfuck of a folder – if you’re lucky. If you’re not lucky, the files will be spread across multiple folders that don’t exist on your computer, and you’d have to go in and relink them all. Ugh.
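For the cemetery case, that means two layers: one of points and one of polygons. A minimal sketch of the split, using GeoJSON-style dictionaries as a stand-in for whatever your software hands you; the cemetery names and coordinates are invented.

```python
from collections import defaultdict


def split_by_geometry(features):
    """Group features into one list per geometry type."""
    layers = defaultdict(list)
    for feature in features:
        layers[feature["geometry"]["type"]].append(feature)
    return dict(layers)


# Hypothetical cemeteries: small ones as points, a large one as a polygon.
cemeteries = [
    {"properties": {"name": "Family plot A"},
     "geometry": {"type": "Point", "coordinates": [-77.41, 39.42]}},
    {"properties": {"name": "Family plot B"},
     "geometry": {"type": "Point", "coordinates": [-77.39, 39.45]}},
    {"properties": {"name": "Town cemetery"},
     "geometry": {"type": "Polygon", "coordinates": [[
         [-77.40, 39.40], [-77.40, 39.41], [-77.39, 39.41],
         [-77.39, 39.40], [-77.40, 39.40]]]}},
]

layers = split_by_geometry(cemeteries)
for geometry_type, items in layers.items():
    print(geometry_type, len(items))
```

Each group would be saved as its own shapefile (say, one for points and one for polygons), and both get loaded into the same project as separate layers.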

Basically ESRI has been doing it for a long time and they make really powerful and really buggy and really archaic software that dominates GIS and hinders innovation, especially when it comes to ease of use. Each version iterates upon the last, and the whole damned thing was put together piecemeal. It’s a surprise it works at all. QGIS, as an open source project, is leaps and bounds above Arc in usability, but it’s still tied to the same silly file formats for compatibility reasons.

If geo-processing isn’t required and you just need to draw lines and such on a map, you might even consider exporting a “good enough” product to Illustrator or similar and labeling/drawing stuff there. The strength of GIS is arguably spatial processing, not necessarily vector prettiness, so a proper vector drawing program may make your workflow easier… up to you.

If overlaying the old map isn’t important, I might also consider skipping GIS altogether and using something like Google My Maps (example here). All I did was import an Excel file with 3 columns (name, long, lat) where the long and lat were just fake numbers at the location. Then you can super easily drag all the pre-named points to where they’re supposed to be. And add additional points/roads/shapes, all on top of a premade, beautiful basemap (or satellite, if you prefer). And you can easily share it with anybody else so they can see it and even help you move or add points. All in a far superior interface.

Every day I dream of Google buying ESRI =/

Another possible workflow would be with MAPublisher, which is a way to make Illustrator work like GIS. Basically, you’d plot your roads in GIS and export them to Illustrator. Then open the Illustrator file, Place a scan of the old map, and fit it to the modern roads. Now Paste in the column of names from Excel and Ungroup them until you have individual pieces of point text for each owner. Drag each one to the proper location. You can quit there if your only need was a more readable map, and save as a PDF. If you want to take the names back into GIS for some reason, the MAPublisher plugins make that possible, and now each name has a lat-long (well, more probably a state plane coordinate) associated with it.

At the mapmakers conference in Pittsburgh this Wednesday, a guy will be talking about another tool he’s developed, using Python, to turn Illustrator files into shapefiles. I’ll be listening intently.

All of the above is true. I just want to add that ArcGIS is gradually moving away from “shapefiles” as such, in favor of “feature classes,” which are identical in most respects, but a broader and more flexible file category.

If you have a shapefile or feature class with attribute data (e.g., names) within it, a standard practice is to “add labels” (autolabel, giving parameters like font size and offset), then convert these labels to an “annotation feature class” – essentially, a shapefile consisting of the text labels themselves – and then fiddle with those labels which look bad cartographically for one reason or another. Among other things, the advantage of this is you can give someone else the labels as a file, rather than it being a product of the temporary symbology settings within an mxd (work-in-progress project file).

And, Mr. Downtown, have fun at NACIS! Sorry I couldn’t make it this time.