Knocean offers consulting and development services for science informatics. Our clients are scientists from academia, government, and industry who need to integrate and analyze their data in search of new discoveries. With our unique combination of technical, research, and communication expertise, Knocean helps our clients expand human knowledge.

Science Informatics

Scientists have always pushed the limits of technology in pursuit of knowledge. In the past it was the biggest particle accelerators, the most powerful telescopes, and the largest clinical trials that were at the forefront of scientific discovery. Today, the greatest obstacle to the next major breakthrough is not collecting more data, but discovering new knowledge in the oceans of existing data. Vast scientific databases only fulfil their potential when we can process that data effectively, link the databases together, analyze the details of our collective knowledge, and discover the laws and patterns that govern our world. Today's research in biology, medicine, physics, chemistry, and astronomy demands not only cutting-edge science, but cutting-edge knowledge management.

Science informatics is a large and growing field, offering many long-term opportunities. Knocean's strategy is to become best-in-class within a valuable niche market: biological and medical database integration using biomedical ontologies. Within this niche, demand for technical services and quality software far exceeds supply. We are establishing a reputation for excellence, providing consulting services and developing a network of clients and contacts. In the future we can build upon this experience to develop best-in-class software products, growing our market and creating an ongoing revenue stream. We expect this market niche to grow quickly, and Knocean will grow with it.

Database Integration Using Ontologies

Biomedical databases are growing in size, scope, and importance. Biomedical ontology projects, such as the Gene Ontology (GO), provide a foundation for integrating this data into wholes that are much greater than the sum of their parts. The principles behind these biomedical ontologies are deceptively simple: provide common terminology for everyone to use, organize it into a coherent system, and make it readable by both humans and machines. Putting these principles into practice requires a rare combination of expertise in standards development and knowledge management. But when the same terms are used to mean the same things across diverse databases, we can use automated tools to integrate that data on a large scale, and discover connections that we would never have seen before.

Example: The Immune Epitope Database

The Immune Epitope Database (IEDB), developed by the La Jolla Institute for Allergy and Immunology, collects the results of every paper ever published in allergy and immunology research. For seven years the IEDB has been pursuing better integration with other databases using biomedical ontologies. They have hired Knocean to take this process to the next level.

As the IEDB's expert curators process each paper, they enter information into the database using biomedical ontology terms for genetics, taxonomy, anatomy, diseases, chemistry, cell types, proteins, experiment types, geography, and more. Each ontology term has a universally unique identifier, and whenever that identifier is used in another database we can perform automated integration. Once the data is annotated in a standardized way, we use information about the relationships between the terms, encoded in the ontology, to perform large-scale automated reasoning. We enrich the existing data with new conclusions, creating new knowledge.
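The two ideas in this paragraph, integration by shared identifier and reasoning over term relationships, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `EX:` identifiers, the record names, and the functions `integrate`, `ancestors`, and `enrich` are hypothetical and are not taken from the IEDB or from any real ontology. Real projects would typically work with OWL ontologies and a dedicated reasoner rather than plain dictionaries.

```python
from collections import defaultdict

# Hypothetical "is a" relationships: child term -> set of parent terms.
# The EX: identifiers are placeholders, not real ontology terms.
IS_A = {
    "EX:0000003": {"EX:0000002"},
    "EX:0000002": {"EX:0000001"},
}


def integrate(*sources):
    """Group records from several databases by shared term identifier."""
    merged = defaultdict(list)
    for name, records in sources:
        for term_id, value in records:
            merged[term_id].append((name, value))
    return dict(merged)


def ancestors(term, is_a):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def enrich(annotations, is_a):
    """Add inferred ancestor terms to each record's asserted terms."""
    enriched = {}
    for record, terms in annotations.items():
        inferred = set()
        for term in terms:
            inferred |= ancestors(term, is_a)
        enriched[record] = set(terms) | inferred
    return enriched


# Two hypothetical databases that happen to use the same term identifier.
database_a = [("EX:0000003", "epitope record 17")]
database_b = [("EX:0000003", "sequence record 42"),
              ("EX:0000002", "sequence record 7")]

merged = integrate(("database_a", database_a), ("database_b", database_b))
enriched = enrich({"epitope record 17": {"EX:0000003"}}, IS_A)
```

Here `merged["EX:0000003"]` collects the records from both databases that share the identifier, and `enriched` shows a record annotated only with the most specific term also being classified under the broader terms above it. This is the simplest kind of new conclusion that the encoded relationships make available.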

So far Knocean has worked with the IEDB to convert the 380,000 classifications in the gigantic National Center for Biotechnology Information (NCBI) Taxonomy into a smaller ontology of 10,000 terms. The smaller ontology maintains all the advantages of the larger one while being easier for immunologists to navigate and use. We have transformed the International Union of Immunological Societies (IUIS) Allergen Nomenclature into ontological form for use by the IEDB. We are helping the IEDB to create an easy-to-use ontology out of the 25,000 proteins in their database, by integrating data from UniProt, GenBank, IUIS, the Gene Ontology, and other sources. And we are working toward a full conversion of IEDB data into ontological form.

Profile of a Prospective Client

Knocean's prospective clients come from any branch of biological or medical research. Despite this diversity, they share many of the same needs as the IEDB. Scientific databases often use ad hoc terminology at first, but grow to use standardized terminology as they mature. Biomedical ontologies are designed for both humans and machines to read, and provide not only standardized terminology but also logical axioms for automated reasoning. Although ontologies are expensive to develop, the benefits of using an existing ontology are great, and so projects use existing ontologies where they can. However, since each project is unique, there is almost always a need to extend the existing ontologies in one direction or another.

Knocean offers consulting services to help scientists design their databases and their data-collection workflows to make best use of existing ontologies. We help scientists develop new terms and new ontologies when the existing ones do not fit their unique needs. We help link their data to other databases, and use automated reasoning to validate and enrich the data. And we develop software to automate all these processes, improving quality and increasing efficiency. The result is better scientific data, better organized for local use, linked to global resources, and ready to share with the world.

Knocean works closely with both the researchers and their IT support staff. With the researchers we assess the needs of the project, set goals, and develop ontologies and workflows for collecting, improving, and analyzing data. With the IT staff we work to implement these plans, adapting existing software when possible, and providing custom software when needed.

Many of the challenges faced by our clients are quite similar. As our experience grows we develop best practices for overcoming those challenges, and software tools to support those best practices. Whenever possible, we release our software under an open source license, which allows us to continue to extend and refine our tools project after project.

Organization

Knocean is a sole proprietorship, founded by Dr. James A. Overton, that began operations in the summer of 2012.

Dr. Overton holds a PhD in philosophy, specializing in philosophy of science and issues of scientific explanation. He also holds degrees with honours in mathematics and humanities. His work experience includes more than a decade of software development using semantic web tools and Internet technologies. Overton has been involved with biomedical ontology development since 2008, and has published on the use of OBO ontologies in radiology and in medicine more broadly. He is an active developer of the Ontology for Biomedical Investigations and contributes to other OBO projects such as the Basic Formal Ontology. His research has been funded by the Social Sciences and Humanities Research Council of Canada and the Rotman Institute of Philosophy, and he has been a visiting scholar in the United States and the Netherlands.

The combination of a background in philosophy with strong foundations in formal reasoning and computation provides significant advantages in our chosen niche. Our clients are already world leaders in their scientific domains. Knocean complements their expertise with our own deep knowledge of ontology tools, techniques, and practice.