Relational Databases Aren’t
In the previous era, the relational database was the tool used for organizing data and information. Organizing all information in massive tables of rows and columns brought structure to the flood of data and information cascading into being. Complex queries make it possible to quickly retrieve information to fill forms and results tables. In general, though, discovering how each individual data point is related to all the others is left to human interpretation. “Join tables” provide some basic capability to connect discrete data points with each other; however, they are brittle, add complexity, and do not adapt to changes in either the underlying data or the desired output.
While relational databases are good at storing, querying and accessing huge volumes of data, they are relational in name only: They are almost completely ignorant of the relationships connecting the data they hold.
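As a rough illustration of the join-table limitation described above, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table and column names (person, company, person_company) are hypothetical and chosen only for this example.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    -- The join table records THAT two rows are linked, not HOW they are linked.
    CREATE TABLE person_company (person_id INTEGER, company_id INTEGER);
    INSERT INTO person  VALUES (1, 'Alice');
    INSERT INTO company VALUES (1, 'Acme Corp');
    INSERT INTO person_company VALUES (1, 1);
""")

# Retrieving the connection requires a query written against this exact schema.
rows = conn.execute("""
    SELECT p.name, c.name
    FROM person p
    JOIN person_company pc ON pc.person_id = p.id
    JOIN company c ON c.id = pc.company_id
""").fetchall()
print(rows)
```

The query returns the linked pair, but nothing in the data says whether Alice is an employee, a customer or an owner of the company. Recording a second kind of connection would typically mean another join table or another column, plus changes to every query that depends on the old layout.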
I’m An Entity, You’re An Entity, And We’re Each Unique
The world is made of data sources. Every form, every transaction, every post/tweet/message, is stored in a data source somewhere. All those data sources vary. It is only by seeing all the data, from every disparate data source, connected, in a common format, in context, that we can truly “know” an entity.
Entity-based computing is the insistence on treating every person, place, thing, event or idea as a discrete entity, assigning each one or more “types,” and merging all known data about the entity into a unique representation of that entity. By using a unique identifier for each entity and connecting all known data to it via named, bi-directional relationships, it is possible to get a holistic view of an entity. It also becomes possible to query the data store by individual entity, entity type or relationship, to explore the many facets of each unique entity and the world in which it lives.
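The idea can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular product: the Entity and EntityStore classes, their method names and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass, field
import uuid


@dataclass
class Entity:
    """One person, place, thing, event or idea, with a unique identifier."""
    name: str
    types: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)
    entity_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class EntityStore:
    """Hypothetical store that keeps entities and their named links."""

    def __init__(self):
        self.entities = {}   # entity_id -> Entity
        self.links = set()   # (source_id, relation_name, target_id)

    def add(self, entity):
        self.entities[entity.entity_id] = entity
        return entity

    def merge(self, entity_id, source_record):
        """Fold a record from another data source into the existing entity."""
        self.entities[entity_id].attributes.update(source_record)

    def relate(self, source, forward, target, inverse):
        """Store a named relationship in both directions."""
        self.links.add((source.entity_id, forward, target.entity_id))
        self.links.add((target.entity_id, inverse, source.entity_id))

    def view(self, entity):
        """Return everything known about an entity: attributes plus named links."""
        related = sorted(
            (rel, self.entities[dst].name)
            for src, rel, dst in self.links
            if src == entity.entity_id
        )
        return {"name": entity.name, "types": sorted(entity.types),
                "attributes": dict(entity.attributes), "related": related}


store = EntityStore()
alice = store.add(Entity("Alice", types={"Person"}))
acme = store.add(Entity("Acme Corp", types={"Organization"}))
store.merge(alice.entity_id, {"email": "alice@example.com"})  # e.g. from one source
store.merge(alice.entity_id, {"city": "Lisbon"})              # e.g. from another
store.relate(alice, "works for", acme, "employs")
print(store.view(alice))
print(store.view(acme))
```

Because each relationship is stored with both its forward and inverse names, the same link reads naturally from either end: Alice "works for" Acme Corp, and Acme Corp "employs" Alice.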
Entities, and entity-types, participate in a Domain Ontology. Represented as a hierarchical tree, the set of types, super-types and sub-types defines the entities that are important to understand in order to comprehend the domain. Entity type hierarchies can be extended to represent each entity at the most granular level of detail required. By traveling back up the entity-type tree, however, it is possible to cluster or aggregate entities by the highest-order type that makes understanding possible.
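A type tree of this kind can be sketched as a simple child-to-parent mapping. The type names and the helper functions ancestors and roll_up below are hypothetical, chosen only to show how entities typed at a granular level can be aggregated at a higher one.

```python
from collections import Counter

# Hypothetical entity-type tree: child type -> parent type (None marks the root).
TYPE_PARENT = {
    "Thing": None,
    "Agent": "Thing",
    "Person": "Agent",
    "Organization": "Agent",
    "Bank": "Organization",
    "Place": "Thing",
    "City": "Place",
}


def ancestors(type_name):
    """Return the path from a type up to the root, most specific first."""
    path = []
    while type_name is not None:
        path.append(type_name)
        type_name = TYPE_PARENT[type_name]
    return path


def roll_up(type_name, level):
    """Return the ancestor of type_name that sits `level` steps below the root."""
    path = ancestors(type_name)[::-1]  # root first
    return path[min(level, len(path) - 1)]


# Each entity is typed at the most granular level available.
entities = {"Alice": "Person", "First National": "Bank",
            "Lisbon": "City", "Acme": "Organization"}

# Aggregate the same entities one level below the root of the tree.
counts = Counter(roll_up(t, 1) for t in entities.values())
print(ancestors("Bank"))
print(counts)
```

Here a Bank is also an Organization, an Agent and a Thing, so the four entities collapse into three Agents and one Place when viewed one level below the root.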
We’re All Inter-Connected – By Relationships
Developing expertise is a process composed of:
- learning all the key “facts”,
- learning how the key facts are related, and
- increasing your depth of understanding of the relationships and their impacts on the facts.
Within every domain there is a set of relationships. These define the important causal, attributive and procedural connections between all the entities that matter within the domain. As the nature of the domain, and our understanding of it, evolves over time, the relationships change and evolve as well. These relationships may be symmetric or asymmetric, and many different relationships may connect any two entities. Understanding the relationships allows us to describe the domain in a common language that enables consistent representation over time and across data sources.
Relationships, and relation-types, participate in a Domain Ontology. Represented as a hierarchical tree, the set of relationships and their sub-relationships enables us to describe the connections between data points at the most granular level of detail required, and yet to travel back up the tree to describe groups of related entities at the level of relationship aggregation that makes the most sense.
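A relationship hierarchy with symmetric and asymmetric relations might be sketched as follows. The relation names, the SYMMETRIC set and the helpers is_a and holds are assumptions made for illustration only.

```python
# Hypothetical relation-type tree: child relation -> parent relation.
RELATION_PARENT = {
    "is associated with": None,
    "is related to": "is associated with",
    "is sibling of": "is related to",
    "is parent of": "is related to",
    "does business with": "is associated with",
    "employs": "does business with",
}

# Symmetric relations read the same in both directions; the rest are asymmetric.
SYMMETRIC = {"is associated with", "is related to",
             "is sibling of", "does business with"}


def is_a(relation, ancestor):
    """True if `relation` equals `ancestor` or descends from it in the tree."""
    while relation is not None:
        if relation == ancestor:
            return True
        relation = RELATION_PARENT[relation]
    return False


def holds(facts, subject, relation, obj):
    """Check whether `relation`, or any of its sub-relations, links subject to obj."""
    for s, r, o in facts:
        if not is_a(r, relation):
            continue
        if (s, o) == (subject, obj):
            return True
        # A reversed match counts only when direction does not matter.
        if (relation in SYMMETRIC or r in SYMMETRIC) and (s, o) == (obj, subject):
            return True
    return False


# Two different relationships connect the same pair of entities.
facts = [("Ann", "is sibling of", "Bob"), ("Ann", "employs", "Bob")]

print(holds(facts, "Bob", "is sibling of", "Ann"))       # symmetric, so True
print(holds(facts, "Bob", "employs", "Ann"))             # asymmetric, so False
print(holds(facts, "Bob", "does business with", "Ann"))  # rolled up, so True
```

Querying at a higher level of the tree ("does business with") finds the more specific fact ("employs") without the query needing to know which sub-relationship was recorded.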
So, Ontologically Speaking, Who Are You?
Entity type hierarchies, combined with the relationship hierarchies, define the language of the domain. It is desirable to further describe each entity by defining the set of relationships that best describe its relation to every other entity in the domain. This unit, an entity together with the set of relationships that define its connections to other entities within the domain, is an Entity-Type Template. The template can be used any time a new entity of a particular type is discovered. By applying the template, and filling the gaps in the related entities that surround it, it is possible to quickly and consistently describe each new entity. By automating the process of filling template gaps from a variety of data sources, it is possible to quickly build up a common representation of everything we know about an entity and its connections within the domain. By iterating this process across all the entities within the domain, it is possible to quickly develop a complex network describing the domain. The superset of Entity Type and Relation Type Hierarchies, combined with the Entity Type Templates, defines the Semantic Domain Ontology. This ontology can be shared among users and used as a tool to compare data from sources outside the domain ontology.
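One way to picture an Entity-Type Template is as a list of expected relationships per type, with empty slots that are filled from data sources. The sketch below is an assumption-laden illustration: the TEMPLATES contents, the function names and the two sample sources are all hypothetical.

```python
# Hypothetical templates: for each entity type, the relationships expected to describe it.
TEMPLATES = {
    "Person": ["was born in", "works for", "lives in"],
    "Organization": ["is headquartered in", "employs"],
}


def apply_template(entity_name, entity_type):
    """Create an entity description with one empty slot per templated relationship."""
    return {"name": entity_name, "type": entity_type,
            "relations": {rel: None for rel in TEMPLATES[entity_type]}}


def fill_gaps(description, sources):
    """Fill empty slots from each source in turn; earlier sources take precedence."""
    for source in sources:
        record = source.get(description["name"], {})
        for rel, value in description["relations"].items():
            if value is None and rel in record:
                description["relations"][rel] = record[rel]
    return description


def gaps(description):
    """List the templated relationships that are still unknown."""
    return [rel for rel, value in description["relations"].items() if value is None]


# Two illustrative data sources keyed by entity name.
hr_system = {"Alice": {"works for": "Acme Corp"}}
registry = {"Alice": {"was born in": "Porto", "works for": "Old Employer"}}

alice = fill_gaps(apply_template("Alice", "Person"), [hr_system, registry])
print(alice["relations"])
print(gaps(alice))
```

The remaining gap ("lives in") shows what is still unknown about the entity, which is what makes the template useful as a checklist when a new data source becomes available. The precedence rule used here (first source wins) is a design choice for the example, not something the text prescribes.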
Entities, Relations, Ontologies – It’s All Just Semantics!
The human mind is the ultimate sense-making machine. It stores information about every fact it has ever consumed in a grey-matter cloud in which all the facts are just waiting to be stitched together, depending on context, into understanding. While our understanding of this process is incomplete and ever evolving, the best model for externalizing this representation of interconnected facts and relationships is the semantic network. Semantic networks, from the domains of cognitive psychology and artificial intelligence, are a simplified, externalized representation of the way our minds work. At their most elemental, semantic networks consist of entities (or concepts), relations, and the unit formed by connecting one entity, via a relationship, to another related entity, which we call a Triplet. Each entity can participate in any number of triplets.
Each entity in a semantic network can be represented as a node, and decorated in a variety of ways. By using bi-directional relationships as labels on the line connecting each pair of entity nodes, it is possible to see, at a glance, everything we know about a given entity, in context. By iterating triplets over and over, it is possible to build arbitrarily large graph-like structures of knowledge and to traverse the network from entity to entity via explicitly named relationships, always maintaining contextual awareness.
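A small sketch of this structure: triplets are stored as (entity, relationship, entity) tuples, each relationship has an inverse label so it can be read in both directions, and a breadth-first search walks the network along named relationships. The INVERSE table, function names and sample triplets are assumptions made for the example.

```python
from collections import defaultdict, deque

# Hypothetical inverse labels, so each triplet can be read from either end.
INVERSE = {"works for": "employs", "is located in": "contains", "knows": "knows"}


def build_graph(triplets):
    """Index triplets by entity, adding the inverse reading of each one."""
    graph = defaultdict(list)
    for subject, relation, obj in triplets:
        graph[subject].append((relation, obj))
        graph[obj].append((INVERSE[relation], subject))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search returning the shortest chain of named relationships."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None  # no chain of relationships connects the two entities


triplets = [
    ("Alice", "works for", "Acme Corp"),
    ("Acme Corp", "is located in", "Lisbon"),
    ("Bob", "knows", "Alice"),
]
graph = build_graph(triplets)

print(graph["Alice"])                      # everything known about Alice, in context
print(find_path(graph, "Bob", "Lisbon"))   # traverse via named relationships
print(find_path(graph, "Lisbon", "Alice")) # the same links, read in the other direction
```

Because every step in a returned path carries its relationship name, the route from one entity to another stays readable: Bob knows Alice, Alice works for Acme Corp, Acme Corp is located in Lisbon.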
These graph-like structures of knowledge are amenable to computation, manual annotation, and path-based analysis or exploration. Within a clearly understood domain of knowledge, they become an externalized representation of the subject matter experts’ mental models of the domain. They can be shared, explored, adapted, merged with other networks and used as the basis for back-end systems that seek to exploit the unambiguous relationships within the knowledge network.
Great Theory, But Will It Work In The Real World?
Since 2001, Semantic Research, Inc. has been using NetCentric Intelligence® to drive the next generation of breakthroughs in intelligence analysis with the Department of Defense, the Intelligence Community, and Law Enforcement Agencies. Semantica®, the NetCentric Intelligence platform, has been used by counter-drug, counter-terrorism, judicial protection, cyber, counter-threat finance, anti-piracy and many, many other programs. Programs using Semantica as their analytical backbone have been responsible for protecting the U.S. from terrorist threats and transnational criminal organizations, countering improvised explosive devices, supporting troops in battle and targeting our enemies near and far. Our programs are recognized as some of the most innovative, cost-effective and successful at stopping bad things from happening.
Having been tested and battle-hardened in the crucible of the defense and intelligence communities, we are confident that this technology can be scaled to meet the challenges of any domain in which there is value to be had from four things: fusing data from multiple different data sources, mapping it into a common language for understanding, representing it in a common structured data format for graph-based computation, and discovering relationships and patterns that can be exploited for protection or profit.