It seems very, very difficult to find truly re-usable software. Is this true? Or is it just that I have not been very smart about looking? For example, I am interested in data. I have been working with a large collection of data published by the California Dept of Health and Human Services. Or rather, it is around 580 datasets and they are all over the map in terms of structure. Some of them are csv files, and some are spreadsheets. But there are actually several different kinds of spreadsheet files and all of them appear somewhere in these datasets. There is map data, which is sometimes described as “open data”, and I suppose that in some ways it is, but most of it uses ArcGIS, and the company behind it is most decidedly not interested in sharing data in such a way that one does not need to buy a copy of ArcGIS. And the data types run the gamut as well.

And you know, there is a trick where you can combine many tables into one. Say that you have 20 tables of 20 columns each. It is possible to rename the columns and combine them into a single table, and then a single csv file, with up to 400 columns. Well, one of the datasets in the Cal HHS system has a csv file with over 10,000 columns and the column names are huge and complicated. There might be several databases of many tables all combined into this one file, and the office putting out this data can then say “open data?”, “check.”
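To make the many-tables-in-one-file trick concrete, here is a small sketch of it in Python, along with the reverse step of grouping the wide column names back into their tables. I do not know how the Cal HHS file was actually built. The `__` separator, the function names, and the idea that each column name starts with its table name are all my own assumptions for illustration.

```python
import csv
import io
from collections import defaultdict
from itertools import zip_longest

SEP = "__"  # assumed separator between table name and column name


def combine_tables(tables):
    """Merge {table_name: [row_dict, ...]} into one list of wide rows.

    Rows are paired up by position; shorter tables simply contribute
    nothing to the later wide rows.
    """
    wide_rows = []
    for rows in zip_longest(*tables.values(), fillvalue={}):
        wide = {}
        for name, row in zip(tables, rows):
            for col, value in row.items():
                # Rename each column so it cannot collide with another table's.
                wide[f"{name}{SEP}{col}"] = value
        wide_rows.append(wide)
    return wide_rows


def to_csv(wide_rows):
    """Write the wide rows as csv text, leaving missing cells empty."""
    fieldnames = []
    for row in wide_rows:
        for col in row:
            if col not in fieldnames:
                fieldnames.append(col)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(wide_rows)
    return buf.getvalue()


def split_header(header):
    """Group wide column names back into {table_name: [column, ...]}.

    A column with no separator ends up as its own group with an empty name.
    """
    groups = defaultdict(list)
    for col in header:
        table, _, name = col.partition(SEP)
        groups[table].append(name)
    return dict(groups)
```

With 20 tables of 20 columns each, `combine_tables` gives rows of up to 400 columns, and `split_header` on the csv header recovers the 20 groups. A real 10,000 column file almost certainly does not follow a naming rule this tidy, so the splitting step is where the actual work would be.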

But what can one do to make this data more re-usable? And how can one find other projects where people have done this? Well, I have no idea. Finding things on GitHub seems to be a lot harder than it should be. Well, and there are data APIs and I should be able to use those also. Now, what the heck is my GitHub API authentication string and where the heck did I put it? Oh well. But I have figured out some ways to make things more re-usable, and I think that I can look for projects that use them. For example, I have Python scripts that use the openpyxl module to take apart the spreadsheets, and I might look for references to this. But taking apart spreadsheets is not really what I want. It is just something I have to do. What I want are REST interfaces, perhaps ones where one can figure out the interfaces. I can perhaps look for artifacts of Swagger interfaces. I could look for MCP interfaces. MCP is interesting. It is not interesting because AI code wants to find it. But if people with the money to run AI crap want to help get MCP interfaces put up, that would be fine. Really, MCP seems to be a way to have self-documenting interfaces to data. It should enable many things, such as agent software. You remember agent software, yes? It is sort of like AI that is actually doing things for you, and for some massive corporation with more dollars than sense. And so, agent stuff would be good.
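Here is roughly what looking for references to openpyxl could look like against the GitHub REST search API. This is only a sketch. As far as I know, code search wants an authenticated request, so the token is read from an environment variable. The variable name `GITHUB_TOKEN` and the function name are my own choices, and the function only builds the request rather than sending it.

```python
import os
import urllib.parse
import urllib.request


def build_code_search(term, language="python", token=None):
    """Build (but do not send) a GitHub code search request for `term`."""
    # Assumption: the token lives in GITHUB_TOKEN unless passed in directly.
    token = token or os.environ.get("GITHUB_TOKEN")
    query = urllib.parse.urlencode({"q": f"{term} language:{language}"})
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    url = f"https://api.github.com/search/code?{query}"
    return urllib.request.Request(url, headers=headers)
```

Sending it is then a matter of `urllib.request.urlopen(build_code_search("openpyxl"))` and reading the JSON that comes back. Keeping the token in the environment at least answers the question of where the heck I put it.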

Well, I have data from the California Legislature and I have begun putting a REST interface on that and I have started to put an MCP interface in front of that. And I will look around and see what I see. More to come.