Source for Socioeconomic Attainment Statistics by Sex

This is inspired by any number of threads over the years on difficulties with dating. One recurring theme is that folks are prone to mistakes like:

A) Looking at dating like ordering a car; they want a rigid set of pre-conditions that probably aren’t available as a package deal and aren’t truly essential to compatibility or happiness.
and
B) They suffer from “champagne taste and beer budget”. AKA “they’re a 6 looking for 7s who’re looking for 8s”.

That got me thinking about the interaction of those ideas, “assortative mating”, “assortative housing”, optimization of multivariate preference functions, and all the rest. Which leads to my question:

There are a bunch of obvious “Sears catalog” virtues like educational attainment, income, assets, debt, age, has child: yes/no, has ex-spouse: yes/no, has conviction: yes/no, etc.

Somebody who wants some filtration by A but doesn’t want to fall into the fallacy of B would probably like to know where they stand in the percentiles and where their target parameters fall in the percentiles of their target sex. The key insight being that for hetero-oriented folks the two numbers will probably be different.

e.g. The guy with the hard science PhD seeking a similar age woman also with a hard-science PhD. That man might be a 1 percentile guy whereas a woman like that is a 0.05 percentile gal. He might think “I’m just looking for the same thing I offer.” True, but not the relevant figure for his search. Guys like him outnumber gals like her 20:1. That’s the info he needs to make a rational filtration decision.
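The arithmetic behind that 20:1 can be sketched in a few lines. This is only an illustration using the made-up shares from the example above (1% of men, 0.05% of women), and it assumes the male and female populations are about the same size; `rarity_ratio` is a hypothetical helper name.

```python
from fractions import Fraction

def rarity_ratio(own_share_pct, target_share_pct):
    """How many people like you exist per person like your target.

    Both arguments are percentages of their own sex's population, passed
    as strings so Fraction keeps them exact. Assumes, for illustration,
    that the two populations are roughly equal in size.
    """
    return Fraction(own_share_pct) / Fraction(target_share_pct)

# Illustrative shares from the example above, not real statistics.
ratio = rarity_ratio("1", "0.05")
print(f"{ratio}:1")
```

If the two populations differ in size, the ratio of headcounts rather than of percentages is the number that matters.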

Even for homo-oriented folks it’s still useful to know just how rare your target demographic is. e.g. If the guy above is seeking guys he’s not batting outside his league by 20:1, but it’s still worth him knowing just how small his league is. And e.g. a similar woman seeking women would be playing in a vastly smaller league than her gay male counterpart.

The US Census Bureau seems to like to capture stuff on a household basis and their website seems like a mix of megabytes of raw data for PhDs armed with stats packages and a random assortment of pretty summary pictures for the rest of us.

Some of the matchmaking sites may offer this data, at least to paid subscribers. Which is not me since my most recent (and apparently final) date was 30 years ago. Subject also to the caveat that lots of their customers exaggerate a lot of what they self-report to these sites.

So … does anyone know of a good source for these sorts of stats broken down by sex? Does anybody know better how to dig into US Census Bureau info? Other sources I haven’t thought of?

This is in GQ because the question is the availability of statistics. This isn’t an IMHO thread on how (not) to date or whether this whole approach is a bad idea practically or morally. Multivariate optimization is fun math. *If* you can get the data to seed the model.

Thanks in advance.

If I’m understanding the desired data correctly (age and sex breakdowns of income, educational attainment, and work industry), you can absolutely get this in the aggregate from publicly available US Census and American Community Survey (ACS) data.

At my job, we use this professionally, and do use stats packages and modeling around it, but you don’t actually need anything fancy. Everything is downloadable as CSVs / spreadsheets straight from the census.gov site by state, and you can aggregate whatever summary data you want from all the states into a master sheet with a little effort and spreadsheet use on your part.
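If you’d rather script the “master sheet” step than do it by hand, something like the following would work. It’s a minimal sketch, assuming every downloaded file shares the same header row; `combine_state_files` and the `source_file` column are hypothetical names, not anything the Census site provides.

```python
import csv

def combine_state_files(paths, out_path):
    """Stack several same-layout CSV files into one master CSV.

    Adds a 'source_file' column so each row can be traced back to the
    file it came from. Assumes all files share one header row; real
    downloads may need their columns reconciled first.
    Returns the number of data rows written.
    """
    header = None
    writer = None
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        for path in paths:
            with open(path, newline="", encoding="utf-8") as f:
                reader = csv.DictReader(f)
                fields = list(reader.fieldnames or [])
                if writer is None:
                    # First file defines the layout of the master sheet.
                    header = fields
                    writer = csv.DictWriter(
                        out, fieldnames=["source_file"] + header
                    )
                    writer.writeheader()
                elif fields != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    row["source_file"] = path
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

You would call it with whatever list of per-state files you downloaded and the name you want for the combined file.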

Now as you point out, this data is at the household level, rather than an individual level…but for the aggregate statistics it seems like you want (e.g. it seems .002% of the population are females under the age of 40 with incomes above $70k and with a master’s or above), it is a valid and useful source.

The above has data for all the states, but you can also break it down by smaller geographies like county or block group if you’re interested in your local area.

When you first open these, you’ll notice they all have seemingly millions of columns - this is because the census has already aggregated a lot of these characteristics by sex and age brackets for you. There’s a key file that decodes the column names if they all seem random (e1002 and such).
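Applying the key file to those coded columns could look something like this. It’s a sketch only: the two-column `code,label` layout, the labels, and the helper names `load_column_key` and `relabel` are all assumptions for illustration, so check the actual key file’s layout before relying on it.

```python
import csv
import io

def load_column_key(key_file):
    """Read a (code, label) CSV into a dict. The layout is assumed."""
    return {row["code"]: row["label"] for row in csv.DictReader(key_file)}

def relabel(row, key):
    """Swap coded column names for readable labels; unknown codes are kept."""
    return {key.get(col, col): value for col, value in row.items()}

# Hypothetical key contents and data row, for illustration only.
key_text = "code,label\ne1002,example label A\ne1003,example label B\n"
key = load_column_key(io.StringIO(key_text))
print(relabel({"geo": "Somewhere", "e1002": "123", "e1003": "45"}, key))
```

Columns that don’t appear in the key (like `geo` here) pass through unchanged, so nothing gets silently dropped.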

Some examples of the types of data available you can slice and dice by:
Age
Income
Rent / Own status
Educational attainment
Industry of employment
Languages spoken
Race / Ethnicity
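
Slicing by a few of these at once boils down to summing the counts in the cells that match and dividing by the total. Here is a rough sketch, assuming you already have a count for each combination of characteristics (which is only true when the table you downloaded cross-tabulates them together). The counts below are invented for illustration and are not Census figures; `share_of_population` is a hypothetical helper.

```python
def share_of_population(cells, matches):
    """Fraction of the total count that falls in cells satisfying `matches`."""
    total = sum(c["count"] for c in cells)
    if total == 0:
        raise ValueError("no population in the supplied cells")
    hits = sum(c["count"] for c in cells if matches(c))
    return hits / total

# Invented counts purely for illustration; not real statistics.
cells = [
    {"sex": "F", "age": "25-39", "education": "masters+", "count": 50},
    {"sex": "F", "age": "25-39", "education": "bachelors", "count": 150},
    {"sex": "F", "age": "40-64", "education": "masters+", "count": 100},
    {"sex": "M", "age": "25-39", "education": "masters+", "count": 200},
    {"sex": "M", "age": "40-64", "education": "bachelors", "count": 500},
]

def is_target(c):
    return (c["sex"] == "F" and c["age"] == "25-39"
            and c["education"] == "masters+")

# Share of everyone, then share within the target sex only.
overall = share_of_population(cells, is_target)
women = [c for c in cells if c["sex"] == "F"]
within_sex = share_of_population(women, is_target)
print(f"{overall:.1%} of everyone, {within_sex:.1%} of women")
```

The within-sex share is the one that corresponds to the “percentile of the target sex” in the original question.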

Jesus, some people have strange ideas about breaking the ice on a blind date.

Oh, and just thought I’d add - for publicly available sources of data, the ACS is the gold standard for this kind of thing.

There are better databases that get more granular or have more and different information, but they cost a lot (in the hundreds of thousands)…whereas with ACS anyone with a spreadsheet program can download and play with the data to their heart’s content.

One thing it does NOT have, of course, is any physical characteristics (height, weight, attractiveness), which I’m told can be important in the dating game. For those, I think you would want to look for some of the CDC data sets (although nobody with a publicly available data source will have attractiveness data). OkCupid, back before they were assimilated by the Borg and stopped doing clever / interesting things with their data, had a few posts on attractiveness and assortative mating by gender in their OkTrends blog that you might find interesting on that front.

Since dating is a perennially interesting topic to many here, I’m sure we’d be interested in hearing whatever you end up finding.