Innovation is part of our daily routine: not a day goes by without discussing ideas or thinking about
problems to solve. As time permits, Unlost researches and innovates in various areas, including
angel investing, software and data products.
This page contains a growing number of notes, short and long, technical and high-level, on topics
and products we are working on. We look forward to feedback and are open to exchanging ideas and
collaborating on these topics.
Preseed Canvas — simplifying investment decisions
The Preseed Canvas is a structured way to assess the potential and progress of early-stage startups. It aims
to cover all aspects relevant to an investment decision beyond the interaction with the team.
Preseed Canvas: Whitepaper on the concept of the Preseed Canvas and its criteria and scores
(updated: 15 Dec. 2024).
Canvas App allows filling in 8 different canvas types, including the Preseed Canvas, without sharing
data with a server (updated: 10 Jun. 2024).
Pitch2Canvas is an upcoming application of the Preseed Canvas to
perform automated analysis and scoring of pitch decks and other documents (updated: 15 Dec. 2024).
Motivation: We are seeing 100 decks for a single investment...
Semantic RAG with LLMs — towards a local knowledge engine
Current LLMs cannot reason, and current "agentic" systems are often unreliable as a result, which has led
to rollbacks of various high-profile AI initiatives. With semantic RAG, we target cutting-edge functionality
that sets out to solve a few challenges with today's LLM and RAG systems, especially in business contexts:
Explainability and hallucination busting via a semantic reasoning layer
("neurosymbolic approach"), the notion of answer confidence, dynamic objective-based and interactive research
strategies, and various forms of annotations for organization, feedback and model improvement.
This project should replace previous work in Pitch2Canvas.
Ragster:
A modular, extensible, privacy-first system for conducting deep research using local or
remote LLMs and a number of RAG sources. It forms the basis of the semantic RAG approach.
Semi-formal languages (SFLs)
are a continuum of knowledge representations that range between unstructured data (including text token sequences) on one hand and formal languages (including ontologies) on the other by providing a flexible framework with varying degrees of formality and factual confidence.
We posit that SFLs can be a key building block for neurosymbolic systems, one that is easy to apply both for extracting SFLs from unstructured data and for making LLM inference (more) explainable and adding true deductive reasoning.
Motivation: LLMs have hit a wall in reliability and non-trivial deductive reasoning tasks.
A solution may build on Gregor's PhD work.
Creating Data Products — unlocking value through AI and semantics
Data products and data-driven products are products where data drives — or is — the core value
proposition. They transform raw information into actionable insights, automated decisions, or
intelligent services that create measurable business value. The rise of AI, particularly large language
models (LLMs), has dramatically expanded the possibilities for data products. Semantic relations between
data are the key to establishing consistency and predictable reasoning. We discuss implications,
opportunities and challenges beyond the hype.
Data Product Management:
Describes my view on data products and what it takes to create, develop and optimize them.
Establishes the role of AI and semantic processing in future data products.
From data to knowledge products:
Data are at the core of today’s processes. So is knowledge not captured in the data. We explore
how a new type of product can marry both.
A typology of AI and data products:
Proposes a pragmatic way of classifying AI and data products and
predicts implications on investment valuation.
Motivation: Gregor has worked 15 years on data and AI products. This is an application of the
Semantic RAG work above.
Pipeline processes, or pipelines for short, are multi-step decision processes that filter a set of candidate
items into a result set, given appropriate criteria. This captures the dynamics of complex decision
making under resource constraints, which is core to fields as diverse as angel investing, recruiting, sales
(esp. B2B), product management and, as an extension, multi-agent systems. Analyzing today’s processes as
pipelines allows a structured approach to improve efficiency and effectiveness through optimizing the
balance and sequence of human and machine decisions, the decision criteria themselves, consistent
valuation of live pipelines and best-practice reuse between application areas.
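The filtering structure described above can be illustrated with a minimal sketch. It is not taken from the pipeline articles themselves: the `Stage` class, the `run_pipeline` function, the deal records and the cost figures are all hypothetical and chosen only to show why the sequence of decisions matters under resource constraints.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Stage:
    """One decision step of a pipeline (hypothetical structure for illustration)."""
    name: str
    keep: Callable[[dict], bool]  # decision criterion: True keeps the item
    cost_per_item: float          # assumed cost of evaluating one item


def run_pipeline(candidates: Iterable[dict], stages: list[Stage]) -> tuple[list[dict], float]:
    """Filter candidates through the stages in order; return survivors and total cost."""
    items = list(candidates)
    total_cost = 0.0
    for stage in stages:
        # Every item that reaches a stage has to be evaluated by it.
        total_cost += stage.cost_per_item * len(items)
        items = [item for item in items if stage.keep(item)]
    return items, total_cost


# Made-up candidate deals for an angel-investing pipeline.
deals = [
    {"name": "A", "sector_fit": True, "team_score": 8},
    {"name": "B", "sector_fit": False, "team_score": 9},
    {"name": "C", "sector_fit": True, "team_score": 4},
    {"name": "D", "sector_fit": True, "team_score": 7},
]

screen = Stage("automated screen", lambda d: d["sector_fit"], cost_per_item=1.0)
interview = Stage("partner interview", lambda d: d["team_score"] >= 7, cost_per_item=10.0)

cheap_first, cost_cheap_first = run_pipeline(deals, [screen, interview])
costly_first, cost_costly_first = run_pipeline(deals, [interview, screen])
```

Both orderings return the same result set (deals A and D), but running the cheap machine screen first costs 34 units instead of 43 in this toy setup. Real pipelines add uncertainty in each decision, which this sketch deliberately leaves out.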
Pipeline Processes — an overview: Introduces pipelines as a structure
and links it to the extensive but disparate work in various fields of application and research. The
result is a tool to help transform and optimize real-world pipelines with the right data and
the right decisions with the right AI augmentation (planned: early 2025).
Pipeline
Processes — a model: The article
gives an overview of a mathematical formulation and proposes a gradient-based solution to optimizing
pipelines (updated: 22 Nov. 2024).
Sales Pipeline Analysis: A study analyzing a real-world sales funnel as a
pipeline process (planned: early 2025).
B2B Market Making: Shows how many B2B BD and sales pipelines are extremely
inefficient and proposes a solution based on an AI-based communication channel that directly links
into existing tools (planned: early 2025).
Motivation: A largely underrepresented area in AI and CRM where innovative approaches could drive
significant improvements.
Project "Lakeshore" — from data mess to data mesh
Data are at the root of any digital or AI business. Unfortunately, making data sources ready for digital
business often turns out to be an organizational and technical nightmare. Data seem like the "dark
matter" of AI in that they exist but are very difficult to get a grip on. Project "Lakeshore" aims at a
knowledge graph-based platform that transforms scattered enterprise data on-premises and in SaaS systems
into a unified, searchable, and AI-ready resource. It can serve as a semantic enterprise search engine
and as a platform for AI and data management initiatives — in other words as an enabler for digital
transformation, especially for SMEs.
Lakeshore architecture: Overall concept and architectural setup for the
Lakeshore system centered around a knowledge graph as semantic overlay, a structured indexing
mechanism and a fine-grained access control mechanism.
Hybrid semantic indexing: Indexing data sources into semantic structures,
using probabilistic approaches, LLMs and ontology learning.
Hybrid semantic retrieval: Combining classic retrieval systems with
semantic and conversational approaches, using probabilistic methods, LLMs, RAG and knowledge graphs. Think
of it as an enterprise "perplexity.ai" with structured reasoning on top.
Privacy-preserving access control: Outlines the structure and algorithm
to ensure confidentiality for federated source systems, private access spheres and levels; considers
semi-private use cases like federated learning.
Data spaces and composite learning: Bringing together mutually
access-restricted data for federated access. This extends Federated Learning.
Motivation: Gregor has seen SaaS application sprawl in many organizations.
Integrating their information has grown in importance now that data are needed to train and fine-tune AI.
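As an illustration of what combining classic and semantic retrieval can look like in the "Hybrid semantic retrieval" note listed above, here is a minimal sketch using reciprocal rank fusion (RRF), a common rank-merging technique. This is an assumption for illustration only: the source does not state which fusion method Lakeshore uses, and the function name, document IDs and the constant `k=60` (a frequently used default) are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) per list it appears in; scores are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up result lists from a keyword index and a semantic (embedding) index.
keyword_hits = ["doc_contract", "doc_invoice", "doc_memo"]
semantic_hits = ["doc_memo", "doc_contract", "doc_policy"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Documents found by both retrievers (`doc_contract`, `doc_memo`) rise to the top of `fused`, ahead of documents found by only one. RRF needs only ranks, not comparable scores, which is convenient when the underlying systems score results on different scales.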
Project "Unleash" — de-risking to drive innovation
Angel investment and Venture Capital portfolio returns are based on long-tail distributions: Outsized
returns of very few investments drive the success of the whole portfolio. For founders, there's only a
slim chance to become commercially successful. For this reason, many innovators do not become startup
founders and many investors choose less risky asset classes. Project "Unleash" aims at reducing risk for
both parties.
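The long-tail effect can be made concrete with a few lines of arithmetic. The return multiples below are made up for illustration and do not come from the Unleash analyses; `top_share` is a hypothetical helper name.

```python
def top_share(multiples, top_n):
    """Fraction of total portfolio proceeds contributed by the top_n investments."""
    total = sum(multiples)
    if total == 0:
        return 0.0
    ranked = sorted(multiples, reverse=True)
    return sum(ranked[:top_n]) / total


# Made-up return multiples for ten investments with equal ticket sizes.
multiples = [0, 0, 0, 0, 0.5, 0.5, 1, 2, 6, 30]

portfolio_multiple = sum(multiples) / len(multiples)                # 40 / 10
share_of_best = top_share(multiples, 1)                             # 30 / 40
without_best = sum(sorted(multiples)[:-1]) / (len(multiples) - 1)   # 10 / 9
```

In this toy portfolio, a single investment contributes 75% of the proceeds. The portfolio returns 4x overall, but only about 1.1x if the best investment is missing, which is the kind of skew that pooling strategies try to soften.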
Pareto Unskewed: Analyzes the return distributions of venture portfolios
and discusses pool-based de-risking strategies. Proposes the concept of a "reverse insurance".
VC Continuation Fund: Proposes the concept of a de-risked funding
mechanism for angel networks that may also make the asset class more attractive for individuals
and family offices.
Motivation: We are interested in making startups more attractive -- both as a career
choice and as an asset class for small investors. This requires a deeper look at the risk profile
and potentially some ways to adjust it.
"How cool is ... ?!" — analyses and opinions
Discussing miscellaneous technical, business and leadership topics, this series highlights a couple of
insights and opinions.
"How cool are ...
"static" webapps?!": Client-only, "static" web apps are a very secure and scalable way of
delivering content. This note gives an overview of the pros and cons.
This is an option that is often overlooked for many use cases.
"How cool is ... the future of Product Management?!": We look at some
AI-induced developments across industries and posit that future PMs will have a widened role and be
much more central to business success.
"How cool are ... Causal Loop Diagrams?!": CLDs are a great way to
describe complex systems and strategies. We introduce the method and propose an AI-based way to get
the most out of them.
"How cool is ... cloud-first security?!": Doing cloud-first seems the
obvious thing for a new company. However, cybersecurity auditors tend to be most familiar with setups
centered around on-premises infrastructure. This note shows how to establish a good security posture without even an intranet.
"How cool are ... the first 100 days as a CTO?!": I did this journey
several times and there's a pattern I am happy to share here.