efitz 2 days ago [-]
I have always called this the “one true taxonomy” problem, because whenever you sit with multiple stakeholders in a room talking about a taxonomy, you can never get to agreement, because there is no such thing as the “one true taxonomy”.
Any hierarchical taxonomy classifies on one dimension at each taxonomic level. Invariably someone wants to classify on one criterion when someone else wants to classify on another. Taxonomies that humans use aren’t multi-dimensional. So if there is a disagreement, someone wins and someone(s) has to lose.
No one is wrong; they just have different priorities or preferences or goals.
So now as an architect I never argue (and seldom discuss) taxonomies. I make two points and then bow out:
1. Whatever your taxonomy is, you need a rubric for each level. You need a procedure or set of questions that unambiguously maps any $THING you encounter into exactly one bucket. Validate that competent people with no specific domain knowledge can properly classify things with your rubric; it must be repeatable by amateurs, not just experts (software is dumb).
2. Existence trumps theory. If there exists a taxonomy and rubric for what you’re classifying, you need to provide a $DARN_GOOD_REASON why this wheel needs reinventing. Personal preference and your 1% edge case probably don’t justify all the work to reinvent everything.
Then, I go back to the implementers and tell them to design in a tagging system, which is a DIY taxonomy, and except in ridiculous use cases, I can make indexes make it fast enough to let everyone overlay their own classification system.
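A minimal sketch of what such a tagging layer could look like, assuming a flat in-memory store plus an inverted index from tag to object ids. The `TagIndex` class, its method names, and the sample documents are made up for illustration; a real system would put the same idea behind database indexes.

```python
from collections import defaultdict


class TagIndex:
    """Flat object store plus an inverted index from tag to object ids."""

    def __init__(self):
        self._objects = {}               # object id -> payload
        self._by_tag = defaultdict(set)  # tag -> set of object ids

    def add(self, obj_id, payload, tags):
        self._objects[obj_id] = payload
        for tag in tags:
            self._by_tag[tag].add(obj_id)

    def query(self, *tags):
        """Return the ids that carry every requested tag."""
        if not tags:
            return set(self._objects)
        # Start from the smallest candidate set so the intersections stay cheap.
        candidates = sorted((self._by_tag.get(t, set()) for t in tags), key=len)
        result = set(candidates[0])
        for ids in candidates[1:]:
            result &= ids
        return result


index = TagIndex()
index.add("doc-1", "dentist bill", {"health", "finance", "2024"})
index.add("doc-2", "tax return", {"finance", "2024"})
index.add("doc-3", "x-ray", {"health"})

print(sorted(index.query("finance", "2024")))    # ['doc-1', 'doc-2']
print(sorted(index.query("health", "finance")))  # ['doc-1']
```

Nothing here forces an object into exactly one bucket, so each stakeholder can layer their own classification on the same objects by choosing their own tags.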
ssivark 2 days ago [-]
> Then, I go back to the implementers and tell them to design in a tagging system, which is a DIY taxonomy, and except in ridiculous use cases, I can make indexes make it fast enough to let everyone overlay their own classification system.
This 100x! I wish this were more common.
The key property of a tree is that there is a unique path (address) for each element, which is a useful property in the implementation layer. But forcing that on users is a horribly leaky abstraction.
Ideally separate the low-level implementation from the interface, and allow users their own way to address content. I imagine object storage (with UUIDs or whatever) is often good enough for the lower layer. For the interface layer, tags are an improvement on categories (tree structure), but I think there's also room for more innovation (fuzzy matching, AI-driven interfaces, etc) that start by allowing trading-off precision for recall but then allow regaining precision by adding more approximate qualifiers to the filtering.
----
PS: Pushing this approach to 11/10...
An intriguing (crazy?) application of this idea would be: what if we did this to the concept of a codebase? Make it a database (with all the corresponding improvements over a filesystem) -- it's no longer a tree of files, and allow users to query code like "that foo which accepts a bar, frobnicates its internal state, and emits a mutated baz". Note that this might also solve the "naming things" problem.
This setup seems like a powerful abstraction for AI coding agents. All that back-end power (database >> filesystem) is something they can easily leverage, and they can also be built to robustly resolve your fuzzy queries into precise addresses, and then update the code based on your desired outcome.
frogulis 1 days ago [-]
> that foo which accepts a bar, frobnicates its internal state, and emits a mutated baz
Tangential, but that reminds me of the Haskell "hoogle" tool which allows searching for functions _by type_ across a large database of libraries, even by abstract types. So you might wonder "hmm what's that function that has a type structure like `t a -> (a -> t b) -> t b`?" and it'll happily tell you that it's monad `bind`.
A [Code Property Graph](https://en.wikipedia.org/wiki/Code_property_graph) takes a codebase and turns it into three graph representations: its AST, a Control Flow Graph, and Program Dependence Graphs. These graphs are overlaid and shoved into a single property graph. It's a structure mainly used by some static analysis tools like [Joern](https://joern.io/).
---
This has been a topic of a lot of interest and research for me. I've been experimenting with figuring out a system inspired by these ideas, among others, to apply the same idea (shoving multiple graph representations together) to a broader set of information
m3047 1 days ago [-]
Tag Clouds were sticky tasty web goodness a few years back.
I've got a legacy tag cloud curation tool for random collections (each collection gets its own id) of URLs. It IFRAMEs each URL to present it; no whining. I've used it for classifying technical docs, photo libraries (then I used the tags to train an image classifier), and to present an analysis of a customer's web site.
It's written in Perl, and (still) runs on modern Perl. Make friends and maybe I'll toss the code your way and help you with your project.
weinzierl 2 days ago [-]
More than once I encountered a project lead (often a higher-up) spend a half day after the project kick-off to create an elaborate folder structure for the team.
Younger me wondered: "Don't they have more important things to do? Why do they never delegate such a menial and boring task, especially when the structure is kind of obvious"
Today it makes total sense to me.
Even if it looks obvious no one has the exact same hierarchy in mind.
It was faster for them to materialize the hierarchy themselves than to convince anyone about it in every detail. Some things just can't be delegated.
designerarvid 1 days ago [-]
That’s a good name! I often resort to “there are many ways to slice a cake”, less sophisticated but gets the point across.
chaboud 2 days ago [-]
The problem with trees is that they are a dimensional reduction, an aggregation; taking a problem without directionality and applying a useful/functional hierarchy.
So a tree is a way to take a high dimensionality graph and make it usefully lower dimensionality, but, given the aforementioned proof, that reduction is going to go from being a lossless compression to a heuristic. So any interesting problem (at least, any problem interesting to me) is only going to be aided (read: not solved exhaustively) by that hierarchy.
I'm okay with this. Being okay with this has been one of the most freeing things over the last 20 years of my career. Accept inaccuracy, and find usefulness in your data structures.
sdwr 1 days ago [-]
The map is not the territory!
panstromek 2 days ago [-]
This is a good diagnosis of the problem. I found that often the right solution is to ditch the hierarchy and use a flat structure. I wrote about this some time ago: https://yoyo-code.com/embrace-flatness/
I think all three problems are really one problem under the hood:
Are these two things actually the same thing, or are they separate?
tikhonj 2 days ago [-]
Reminds me of my favorite math essay: "When is one thing equal to some other thing?"
It's a great question, much deeper and more interesting than it seems. The essay suggests thinking in terms of isomorphisms (relative to the structure you care about) rather than equality in some absolute sense, and I've found a fuzzy version of that to be a really useful perspective even in areas that can't be fully formalized.
Mathematics is all just explanations for why this is really that. If it didn't have to respect its human audience, and their failure to grasp similarities, the whole edifice could be one implicit statement. (After all, since this is really that, there is no this or that.) So mathematics is about people.
moi2388 10 hours ago [-]
Yes, I found the same and was very pleasantly surprised when I first learned about the ideas of cubical type theory.
hackthemack 2 days ago [-]
I jumped to a similar conclusion right away and popped over here to comment only to find you have beaten me to the punch. I used to keep a work wiki page of common problems the team encounters over and over again.
Years ago, I stumbled upon the fact that this "idea" was already debated in other fields long before programming. Lumpers and Splitters.
Thank you for sharing, first I've heard of this tree type.
gobdovan 2 days ago [-]
I have a deep distrust of hierarchies, because they keep you trapped into a single model that keeps extending its authority, usually without anyone explicitly deciding that it should do so. For example, the file system: once hierarchy was deemed the main metaphor for navigation, the structure persisted and was reused for organisation, ownership, access control and governance. And it became infrastructure we cannot easily remove before we could even question if it was right or not. And once it dominated, non-hierarchical things were retrofitted as glue, e.g.: symlinks, aliases, shortcuts... also, when's the last time you've used a tag?
The webs are so much more malleable, but they're also not free. All the 'good enough's that a hierarchy was taking care of implicitly are now your responsibility to model precisely, and you have to make sure they're performant as well. Look at ReBAC, for example. It gives you expressive power, but it also forces you to reason precisely about relationships, graph traversal, consistency, and cost. Strikingly similar is GraphQL.
Interestingly, source code is hierarchical, but compiles almost immediately to a graph IR and most analysis and optimisations happen there. But almost nobody looks at a CFG/SSA graph directly. You author in a hierarchical manner, yet the operational substrate is a malleable graph.
delusional 2 days ago [-]
> before we could even question if it was right or not
But we could. Early filesystems didn't have hierarchy. They'd sometimes support a single level of organization by treating a prefix of the filename as a directory, but nothing below that.
We looked at that and considered hierarchy better.
What you lament is that we stopped with hierarchy. You wish for something more general, and that you have received. With hard links and ACLs the filesystem is freed from its shackles of hierarchy. Most people don't need that most of the time.
The strict hierarchy sits as the third step of a complexity ladder. The reason we stopped there, in most cases, is that it's enough complexity.
gobdovan 1 days ago [-]
You're overfitting on the filesystem example. My point is model lock-in.
Once a system commits to one hierarchy, it becomes very hard to introduce another authoritative model. Not only can you not have alternative models with equal authority, you do not even get a clean parallel hierarchy later; you get the first hierarchy, now ossified, plus workarounds around it. On the other hand, webs are harder to reason about and more expensive to operate, but they can produce hierarchies 'on-demand' and you can get rid of them as easily as you've established them. Ergo, a ton of different programming languages just compile to LLVM IR.
speed_spread 2 days ago [-]
IMO hard links are underused in filesystems. You can have the same file / dir appear in different places under different names. Once linked, app doesn't have to care and runtime cost is zero.
PhilipRoman 2 days ago [-]
Hard links suffer from the general issue of there being two styles of writing to a file - open(2)/write(2) vs rename(2). Depending on the internals of each program you use to update the file, you will get very different results.
This is one of the ugliest parts of POSIX design, making idioms like -o /dev/null and file attributes unpredictable (I've had a server run out of disk space because a root-owned process used rename-style writing on /dev/null)
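To make the two writing styles concrete, here is a small sketch (file names and helper names are made up for illustration) that hard-links a file and then updates it both ways. It assumes a filesystem that supports hard links.

```python
import os
import tempfile


def write_in_place(path, text):
    # open(2)/write(2) style: the inode is reused, so every hard link sees the change.
    with open(path, "w") as f:
        f.write(text)


def write_via_rename(path, text):
    # rename(2) style: a fresh inode takes over the name; other links keep the old one.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)


def read(path):
    with open(path) as f:
        return f.read()


with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a.txt")
    b = os.path.join(d, "b.txt")
    write_in_place(a, "v1")
    os.link(a, b)                 # b is now a second name for the same inode

    write_in_place(a, "v2")
    in_place_result = read(b)     # b follows the in-place edit

    write_via_rename(a, "v3")
    rename_result = read(b)       # b still holds the old contents
    still_linked = os.stat(a).st_ino == os.stat(b).st_ino

print(in_place_result, rename_result, still_linked)  # v2 v2 False
```

After the rename-style write, `a.txt` and `b.txt` are two unrelated files, which is the "very different results" case: whether a link survives depends on which style the writing program happens to use.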
lstodd 2 days ago [-]
That was breakdown on a different level: your server process had no business of renaming stuff and it still did that. POSIX had nothing to do with this.
PhilipRoman 1 days ago [-]
I'd agree if POSIX provided something like an open() flag which made the changes visible atomically, but as it stands, the rename() idiom is the mainstream way of durable file writing, so it is commonly used. Practical example using busybox sed: (GNU sed detects this case and refuses to overwrite)
    / # stat /dev/null
    File: /dev/null
    Size: 0 Blocks: 0 IO Block: 4096 character special file
    ...
    / # sed -i 's/foo/bar/' /dev/null
    / # stat /dev/null
    File: /dev/null
    Size: 0 Blocks: 0 IO Block: 4096 regular empty file
    ...
jmalicki 1 days ago [-]
It's the fundamental idiom of how to do atomic file replacement. The server process had better be doing that over editing a text file in a way that could leave it invalid if the process OOMs mid-edit, or another process is reading it while it's being written.
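For reference, a rough sketch of that idiom in Python, assuming a POSIX-like system. The `atomic_write` name is made up for illustration, and real code would also want to think about file permissions and ownership, since the temp file is created with restrictive defaults.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Replace `path` so readers see either the old or the new contents, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # push the data to disk before the rename
        os.replace(tmp, path)      # atomic swap of the name
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)         # do not leave a stray temp file behind
        raise
    if hasattr(os, "O_DIRECTORY"):
        # Also sync the directory so the rename itself survives a crash (POSIX only).
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "config.txt")
    atomic_write(target, b"one")
    atomic_write(target, b"two")
    with open(target, "rb") as f:
        final = f.read()
    leftovers = os.listdir(d)

print(final, leftovers)  # b'two' ['config.txt']
```

Note that this always gives the name a new inode, which is exactly why it clobbers hard links and special files like /dev/null, as described upthread.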
fellowniusmonk 1 days ago [-]
Here is my take using the city as a metaphor.
All physical things have to be arranged in a linear fashion; this is the Pauli exclusion principle writ large.
A type of mereological nihilism is true, so hierarchy is real but in a weak emergence sense.
When I am standing on a street corner in a city, I can walk into a building and then navigate into a room inside that building and pick up a piece of paper inside.
Or I can walk down the street and as I pass landmarks, I will pass things in order, this is also a perceived hierarchy, because my starting position and movement are bound by time, I will always have a fixed order of experiences from any x,y coordinate.
So everything bounded by time and space acts as an emergent hierarchy via movement for a specific locus.
The paper can be moved however, and that same piece of paper sitting in the drawer at an attorney's office means something very different than when it's sitting in the approved licenses cabinet in an official government building.
Semantic meaning is first class but it's an entirely different dimension than other types of meaning and due to pointers it isn't bound by Pauli exclusion.
When people start thinking about the semantics of objects in computer programs they often become confused, because the semantic representation of the object follows semantic ontology laws not physical object ontology.
People can fuck this up in either direction and think physical laws apply to semantic meaning or physical meaning applies to semantics.
gobdovan 23 hours ago [-]
Yeah, that's part of it. But I'd argue that once you look at a city, you look at constructions administratively designed by men, with their preference for hierarchies. Try to hierarchically describe to someone instructions to pick a random leaf in a dense tree (ironically, we use them as a metaphor for hierarchical organisation) and see how far you could go in the same way you could point them out to pick a paper in the city.
Not in a position to argue about physics, but I could think of quite a few things that are better described as web-like phenomena rather than hierarchical ones, electricity in a complex circuit comes to mind.
mcphage 2 days ago [-]
I thought the two hard problems were naming things, cache invalidation, and off-by-one errors?
rectang 2 days ago [-]
At least the title “The Third Hard Problem” is still appropriate regardless of whether you get the joke right.
fragmede 2 days ago [-]
Don't race forget conditions!
cheschire 2 days ago [-]
His message was submitted before the memory recall completed execution.
layer8 1 days ago [-]
They are. The article is about a third hard problem.
mcphage 10 hours ago [-]
Touché :-)
phazy 2 days ago [-]
John Ousterhout has also written about this in A Philosophy of Software Design:
> The most fundamental problem in computer science is problem decomposition: how to take a complex problem and divide it up into pieces that can be solved independently.
gw32 2 days ago [-]
Well elucidated. This problem has irked me for years in the form of multiple inheritance. When it's disallowed (like Java, unfortunately), trying to reduce a directed graph structure to a single dominant hierarchy is quite the bothersome choice.
guardiantesla 2 days ago [-]
I doubt there would exist a perfect taxonomy for everything. Taxonomies are subjective to individuals but somehow can be (ahem! AI) mapped relative to someone else’s preferred taxonomy. What efficiency and meaning each individual (or organization or community) yields would be completely different.
ToniDoni 2 days ago [-]
I thought it was timezones.
iamwil 2 days ago [-]
I think I've always called this "Ontology is hard". It's genuinely useful when it's used as a tool for clarification. It's constraining when it's used as a tool for modeling.
naruhodo 2 days ago [-]
For me, the canonical example is organising images in folders vs tags.
renox 1 days ago [-]
Is there an OS whose primary FS is a taggedFS?
It'd be interesting to hear from the users if this is nice to use or not..
js8 2 days ago [-]
Every few years I watch, with amusement, our management restructuring the organizational hierarchy, allegedly because the old one didn't work.
cgio 2 days ago [-]
Maybe allegedly so but in reality it worked once again, given there’s still management to be reorganised.
js8 2 days ago [-]
Well, tell that to my management, that it's a pointless endeavor. The structure of (required) social relations in the workplace is a graph; so there will always be some flaw when trying to conform this into a tree.
Svoka 2 days ago [-]
Putting objects into trees is basically a caching problem.
recursivecaveat 2 days ago [-]
I was thinking it's a naming problem haha, a file path can be seen as a global/fully-qualified name really.
kator 2 days ago [-]
I wonder whether the author deliberately avoided ontology? That's what comes to mind when I read this. The age-old debate between taxonomy and ontology.
zephen 2 days ago [-]
The article veers from saying computers are different, to saying they should be different but maybe aren't, back to how special they are:
> The next time you sit down to an empty design doc and don’t know where to start, be kind to yourself. You’re solving a hard problem.
This supposed hard problem in computing has always been with us, in real life. Which he admits multiple times, e.g.:
> Yet Victorian-era gentlemen might have pondered the same questions while sorting letters as we do while sorting virtual paper.
He appears to claim that the sole organizing principle in real life is the hierarchy, but, of course, that computers and ideas are different:
> Hierarchies are so natural to us that they ... [work] for physical objects that can be in only one place at a time. Ideas and information, however, resist taxonomies. They form intricate webs that penetrate rigid boundaries.
This distinction of physical vs. virtual requirements doesn't hold up under any sort of rigorous analysis. As he admits, hierarchies are not always ideal in physical space -- do we organize parts and supplies separate from tools, or place them next to their probable job sites?
And of course, the "in only one place at a time" is certainly true for any given group of atoms, but we have become adept at making fungible copies of atoms for many things. I might have drywall screws or 33 ohm resistors in multiple cached locations, and I have soldering irons and screwdrivers and pliers on more than one workbench.
One thing that is true is that we can usually add non-hierarchical groupings to information more easily than we can to groupings of atoms.
Another thing that is true is that we already often do so whenever the convenience outweighs the various costs.
And the third thing that is true is that this, also, is not much different than the physical world, where we routinely both break our hierarchies and create copies of things when needed.
aleksiy123 2 days ago [-]
Use multiple trees.
rzzzt 1 days ago [-]
The tree node on the UI should not only expand downwards but sideways and in-out of the screen.
EGreg 1 days ago [-]
This is an issue of growing ontologies and taxonomies.
I have found that the best answer I have is:
1) Gossip and use the existing ontology and add new items only if genuinely needed
2) Combine synonyms, sometimes using vector embedding and cosine similarity during search
The insight is that trees are not a storage constraint but a traversal consequence — any walk through a graph starting from a given node produces a tree as a byproduct, and different starting points or orderings produce different trees from the same underlying structure. Rather than forcing a choice of canonical hierarchy at write time (the dentist-bill-goes-in-which-folder problem), you store the graph and let the tree emerge lazily from context: whoever is querying, from wherever they enter, gets the tree that is relevant to them, and no connections are sacrificed to produce it.
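A tiny illustration of that point, using a breadth-first walk over a made-up graph (node names are hypothetical, loosely based on the dentist-bill example). The stored graph never changes; only the entry point does.

```python
from collections import deque


def bfs_tree(graph, root):
    """Walk `graph` breadth-first from `root` and return {node: parent}."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in parent:   # first visit decides the parent
                parent[neighbour] = node
                queue.append(neighbour)
    return parent


# Undirected graph stored as adjacency lists; no canonical hierarchy is chosen.
graph = {
    "dentist-bill": ["health", "finance", "x-ray"],
    "health": ["dentist-bill", "x-ray"],
    "finance": ["dentist-bill", "tax-return"],
    "x-ray": ["health", "dentist-bill"],
    "tax-return": ["finance"],
}

from_health = bfs_tree(graph, "health")
from_finance = bfs_tree(graph, "finance")

print(from_health["dentist-bill"], from_health["x-ray"])    # health health
print(from_finance["dentist-bill"], from_finance["x-ray"])  # finance dentist-bill
```

Someone entering from "health" finds the bill filed under health, someone entering from "finance" finds it under finance, and the edges that did not make it into either tree are still in the graph for the next query.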
In general, federated systems have this quality emerge. For example, in OpenStreetMap, one group can label Nagorno Karabakh, the other Artsakh, and no one has to fight Google etc.
jeffbee 2 days ago [-]
The first chapter of this waves away the fact that hierarchical filesystems are now useless, but it is still a fact. There is no more reason to organize your files than there is to drive around in a chariot. It is hard to map one domain to the other, but it is also not necessary. With AI indexing and recall it's less necessary than it has ever been.
g8oz 2 days ago [-]
This seems optimistic.
adampunk 2 days ago [-]
This is more true as stated than people want to give credit for, usually.
[Unison](https://www.unison-lang.org/docs/the-big-idea/) content addresses every definition. Kinda interesting.
And that's a problem because Aggregability is NP-Hard: https://dl.acm.org/doi/abs/10.1145/1165555.1165556
Big insight in that article is also from https://matklad.github.io/2021/08/22/large-rust-workspaces.h... about structuring large rust workspaces as a flat list.
1. Naming things 2. Cache invalidation 3. off-by-one errors
<https://en.wikipedia.org/wiki/Computer_Lib/Dream_Machines>
<http://link.springer.com/10.1007/978-3-319-16925-5>
https://people.math.osu.edu/cogdell.1/6112-Mazur-www.pdf
https://en.wikipedia.org/wiki/Lumpers_and_splitters
https://interactivity.ucsd.edu/articles/In_Process/MultiTree...