Category: First person technology (Page 1 of 4)

Toward a Personal AI Roadmap for VRM

June 18, 2025 / Doc Searls / 4 Comments

On the ProjectVRM list, John Wunderlich shared a find that makes clear how advanced and widespread AI-based shopping recommendation has gone so far (and not just with ChatGPT and Amazon). Here it is: Envisioning Recommendations on an LLM-Based Agent Platform: Can LLM-based agents take recommender systems to the next level?

It’s by Jizhi Zhang, Keqin Bao, Wenjie Wang, Yang Zhang, Wentao Shi, Wanhong Xu, Fuli Feng, and Tat-Seng Chua* and is published in the Artificial Intelligence and Machine Learning section of Research and Advances in Communications of the ACM. So it’s serious stuff.

Here’s one graphic from the piece, with its caption:

Illustration of the Rec4Agentverse. The left side depicts three roles in the RecAgentverse: the user, the Agent Recommender, and Item Agents, along with their interconnected relationships. In contrast to traditional recommender systems, the Rec4Agentverse has more intimate relationships among the three roles. For instance, there are multi-round interactions between 1) users and Item Agents and 2) the Agent Recommender and Item Agents. The right side demonstrates how the Agent Recommender can collaborate with Item Agents to affect the information flow of users and offer personalized information services.

With VRM, however, we need our own personal AI agents, and useful data that’s not limited to what’s gathered by surveillance (which we can assume is most if not all of the Information layer).

So I started a conversation with ChatGPT. I’ll divide it into sections:

Who wrote the CACM piece
The state of recommendation science
Who is working on personal AI
How one can get started
Forms of personal data to manage

There’s a lot here. The meat of it, for ProjectVRM purposes, starts in section 3. (The first two are more about what we’re up against.) Please read the whole thing and help us think through where to go with the challenge facing us. As we see in section 3, we do have some stuff on our side.

1. Who wrote the CACM piece

Me:

Can you tell me more about the authors here: https://cacm.acm.org/research/envisioning-recommendations-on-an-llm-based-agent-platform/?

ChatGPT:

🧑‍💼 Jizhi Zhang

Affiliation: University of Science and Technology of China (USTC), Data Science Lab (inferred from USTC page) cacm.acm.org+14data-science.ustc.edu.cn+14x.com+14.
Research Interests: Recommendation systems, LLM-based agent platforms, and interactive intelligent systems—his lead authorship and continued work on Prospect Personalized Recommendation on LLM-Based Agent Platform (arXiv Feb 2024) reinforce this arxiv.org.

🧑‍🔬 Keqin Bao, Wenjie Wang, Yang Zhang, Wentao Shi, Fuli Feng, Tat‑Seng Chua

Affiliations: While CACM doesn’t list this, Jizhi Zhang and Fuli Feng are part of USTC’s Data Science Lab; Tat‑Seng Chua is a distinguished researcher from the National University of Singapore known for multimedia and information retrieval.
Expertise: A mix of recommender system design, LLM-agent evaluation, multimedia analysis, and human-AI interaction—reflecting a blend of USTC’s strengths and CACM relevance data-science.ustc.edu.cn+8cacm.acm.org+8data-science.ustc.edu.cn+8linkedin.com+4arxiv.org+4researchgate.net+4.

🧑‍🎓 Wanhong Xu

Affiliation: USTC Data Science Lab (USTC LDS), contributor to the same article .
Role: Emerging researcher whose latest publication is this Rec4Agentverse article, indicating an early-career focus in LLM-based recommender systems linkedin.com+8data-science.ustc.edu.cn+8linkedin.com+8.

🧠 Broader Context & Contributions

The group operates at the intersection of Large Language Models, agent-oriented systems, and recommender systems, pioneering a Rec4Agentverse paradigm that treats LLM-powered agents as “recommendable” items collaborating with one another—an evolution of traditional recommendation frameworkscacm.acm.org+5cacm.acm.org+5cacm.acm.org+5.
Their work continues earlier research (e.g., Prospect Personalized Recommendation… in Feb 2024), solidifying their ongoing commitment to expanding intelligent, conversational recommendation platforms arxiv.org.

🧭 Why This Matters

Academic Thought Leadership: Jizhi Zhang and colleagues are pushing forward thinking about how recommendations can evolve when agents—not passive items—are central.
Institutional Backing: All deeply connected to USTC’s Data Science Lab, underscoring strong Chinese academic investment in AI-driven recommender innovation.
Collaborative Reach: Including scholars like Tat‑Seng Chua broadens the research to include international perspectives in multimedia and retrieval systems.

In summary: This team is a multidisciplinary cohort—centered at USTC with international collaboration—who are pioneering the idea that LLM agents can become interactive, dynamic items in recommender systems. Their work positions them at the forefront of intelligent agent systems and could reshape how platforms personalize experiences in the future.

If you’re curious about any individual’s past publications, citation impact, or related work (e.g., on Agent4Rec), I’d be happy to dive deeper!

3. The state of recommendation science

Me:

How large is the emerging science, within the AI field, of recommendations?

ChatGPT:

The science of recommendations, particularly within the AI field, is not just emerging—it’s a massive, rapidly evolving subfield that is both academically rich and commercially dominant. That said, it’s also entering a new phase, driven by the rise of large language models (LLMs), interactive agents, and multimodal platforms.

📚 Academic Scope

1. Established Field: Recommender Systems

Recommender systems (RecSys) are a well-developed domain within AI and machine learning, with roots in the 1990s.
Annual conferences like:
ACM RecSys (since 2007)
SIGIR, KDD, WSDM, and NeurIPS regularly feature recommendation research.
Thousands of papers annually on topics like:
Collaborative filtering
Matrix/tensor factorization
Deep learning for recommendation
Fairness, transparency, and explainability in RecSys
Reinforcement learning and bandit algorithms for personalization

🔎 As of 2024, over 60,000 papers in academic databases (e.g., Google Scholar) are tagged or indexed under “recommender systems”.

📈 Current Trends & Expanding Frontiers

2. LLMs + Recommendation = New Subfield

A major shift is underway: integrating Large Language Models (LLMs) into recommendation systems.
These go beyond “you may also like” to:
Use natural language context and dialogue
Generate explanations, not just rankings
Support conversational recommendation
Coordinate agent ecosystems, e.g., multi-agent reasoning environments

Example paradigms:

RecMind – reasoning-aware recommendation agents
Rec4Agentverse – treating agents as recommendable entities
ChatRecs – integrating LLMs into live user conversations

3. Multimodal & Multitask Integration

Systems increasingly blend text, image, video, behavior, and user feedback.
LLMs allow richer personalization using complex user intent derived from unstructured data.

4. Industry Investment

FAANG companies (especially YouTube, Netflix, Amazon, TikTok, and Spotify) base their core user experience on AI-driven recommendation.
Billions of dollars are spent optimizing:
Ranking quality
Engagement prediction
Real-time inference
Federated and privacy-preserving personalization

🧭 Estimate of Scope

Area

Estimate (2024–25)

📄 Academic papers (RecSys + AI)

60,000+ overall; thousands per year

📚 Top conferences with recommender tracks

20+ (ACM RecSys, SIGIR, KDD, NeurIPS, etc.)

👩‍💼 Active researchers

Tens of thousands globally

🧠 Emerging subfields

LLM-based Rec, conversational Rec, agent-based Rec, explainable Rec

💰 Industry scale

$10B+ annually invested in personalization tech

🧠 In Summary

The science of recommendation is:

Not new, but rapidly evolving
Core to AI applications, especially with LLMs and agents
One of the most commercially and socially impactful domains in modern AI
Growing into new subfields (e.g. Rec4Agentverse, LLM Rec, multi-agent Rec)

Let me know if you’d like a visual map of how the recommendation science ecosystem is structured—or who the top authors and institutions are in this space

Estimate (2024–25)
📄 Academic papers (RecSys + AI)	60,000+ overall; thousands per year
📚 Top conferences with recommender tracks	20+ (ACM RecSys, SIGIR, KDD, NeurIPS, etc.)
👩‍💼 Active researchers	Tens of thousands globally
🧠 Emerging subfields	LLM-based Rec, conversational Rec, agent-based Rec, explainable Rec
💰 Industry scale	$10B+ annually invested in personalization tech

3. Who is working on personal AI

Me:

And who, if anyone, is working to provide customers with their own personal AI tools: ones that work for them, to empower their own agency in the marketplace? By which I mean truly personal AI, and not the kind personalized by some external LLM (or similar) system.

ChatGPT:

That’s an essential and forward-looking question—especially as most “AI for you” today really means “AI about you, for someone else.”

While Big Tech firms dominate personalized AI, a growing ecosystem is focused on truly personal AI—that is, AI you own, that works for you (not advertisers or platforms), and that enhances your agency in the marketplace and your life.

Here are the most notable people, projects, and initiatives advancing truly personal AI today:

🧠 1. Kwaai and pAI-OS (South Africa / Global)

Kwaai is building an open-source, user-sovereign personal AI OS called pAI-OS.
Designed to run locally or at user discretion, with full data ownership, modular agent systems, and plugin capability.
Aligns with IEEE P7012 (“MyTerms”) privacy standards (see below).
Actively exploring intentcasting, customer signaling, and VRM (Vendor Relationship Management).
Site: https://kwaai.ai

🛠️ 2. Indie AI and Personal Agent projects

Several smaller open-source initiatives are trying to build personal agent toolkits that you can host and configure:

Examples:

AutoGen Studio (Microsoft) – allows building agent workflows, some of which can be user-defined. But it’s not yet fully personal or private.
Langroid – open-source framework for local agent-based LLM systems. Designed for extensibility, can be made personal with effort.
OpenDevin, Superagent, Auto-GPT – can be self-hosted, though most are still tool-centric rather than user-centric.
Personal.ai – A commercial startup that says it is personal, but uses centralized storage and fine-tuning, so actual user control is limited.

🔍 Reality check: Most of these still rely on external LLMs (like OpenAI or Anthropic), even if they claim to be “your AI”.

🌐 3. Solid Project (by Tim Berners-Lee)

While not AI per se, Solid is a decentralized identity and data storage architecture that allows individuals to store their own data in personal data pods.
AI tools can be layered on top to access only the data you choose to share.
Foundation for personal AI assistants that can work with your data, not on others’ clouds.
https://solidproject.org

📜 4. IEEE P7012 – MyTerms

A standard being developed to flip the notice-and-consent regime—so individuals, not companies, dictate the terms of data use.
Backed by Customer Commons and designed to support intentcasting and personal marketplace signaling.
This enables a personal AI to serve as your agent, enforcing your terms and engaging the marketplace on your behalf.

🧩 5. Project VRM (Vendor Relationship Management)

Initiated by Doc Searls to empower customers with tools to manage vendors, not just the reverse.
Envisions personal tools (including AI) to:
Signal intent
Manage service contracts
Retain records of offers, purchases, warranties
Avoid manipulation, surveillance, and lock-in

Project VRM is not a software project per se, but a framework guiding developers to build personal tools that support individual autonomy.

🔒 6. Open-source LLM stacks for local use

Several local LLM projects can be combined with personal data layers to build truly personal AI:

LM Studio – runs local models on desktop
PrivateGPT – local inference + local documents
Ollama – macOS-native interface for fast-running models like LLaMA, Mistral
LangChain / LlamaIndex – frameworks for building RAG systems with local documents

These aren’t personal AI systems yet—but they are key ingredients.

🧭 Summary

Name / Project

What They Do

Status

Kwaai / pAI-OS

Open, personal AI OS and agent platform

Early but active

Solid

Personal data control layer

Stable; in use

IEEE P7012

Standards for user-enforced data terms

In development

Project VRM

Framework for tools that empower buyers

Thought leadership; toolkits emerging

Local AI frameworks

Tech foundations for personal AI

Available, DIY-friendly

🚧 The Gap

Despite this promising landscape, no major platform today delivers a fully integrated, easy-to-use personal AI that you own, configure, and direct—yet. But:

A personal AI revolution is underway, led by privacy advocates, open-source communities, and a few visionary developers—not Big Tech.

Would you like help putting together a DIY stack for your own personal AI toolkit based on this landscape?

What They Do	Status
Kwaai / pAI-OS	Open, personal AI OS and agent platform	Early but active
Solid	Personal data control layer	Stable; in use
IEEE P7012	Standards for user-enforced data terms	In development
Project VRM	Framework for tools that empower buyers	Thought leadership; toolkits emerging
Local AI frameworks	Tech foundations for personal AI	Available, DIY-friendly

Me:

Can you name some of the people in the personal AI revolution, and tell me what that DIY stack would contain?

ChatGPT:

Absolutely. Here’s a breakdown of key people involved in the personal AI movement, followed by a practical DIY stack you can start building today.

👤 People Leading or Advancing the Personal AI Revolution

These individuals aren’t all working on the same projects, but they’re aligned in trying to build AI that’s owned, run, and controlled by the individual—not centralized platforms.

🧑‍🔬 Tim Berners-Lee

Project: Solid (stewarded by ODI, commercially implemented by companies including TBLs Inrupt)
Focus: Decentralized web and personal data ownership via data pods
Impact: Infrastructure for privacy-respecting personal AI agents
https://solidproject.org

👤 Doc Searls

Project: Project VRM, Customer Commons, IEEE P7012
Focus: Customer empowerment, intentcasting, reversing surveillance capitalism
Impact: Philosophy and standards for customer-directed personal AI
https://projectvrm.org

🧠 Tawanda Kembo & Kwaai team

Project: pAI-OS (personal AI operating system)
Focus: Agent-centric open-source operating system for personal AI
Impact: One of the few efforts aiming to create a complete personal AI stack
https://kwaai.ai

🧑‍💻 Simon Willison

Project: Datasette, open tools for data management
Focus: Personal data control, lightweight databases, LLM + local tooling
Impact: Key figure in the open data + local AI tooling space
https://simonwillison.net

🤖 Brian Roemmele

Project: Multiplex (personal voice AI)
Focus: Voice-first personal assistants you run locally
Impact: Vocal advocate for true personal AI, often referencing “software you own”

🛠️ A DIY Personal AI Stack (2025 Edition)

This stack assumes local-first, modular, privacy-respecting tools. You can mix and match based on your needs.

🧩 Core Components

Layer

Tools / Projects

Notes

1. OS Layer

Linux, pAI-OS (in dev)

Linux gives flexibility; pAI-OS aims to be a full personal AI OS

2. Local LLM Inference

Ollama, LM Studio, PrivateGPT, LocalAI

Run models like LLaMA 3, Mistral, Phi-3 entirely offline

3. Vector Database

Chroma, Weaviate, Milvus, or SQLite with FTS5

Store and retrieve embedded text data

4. Personal Indexing

LlamaIndex, LangChain, AutoGen, GPTPilot

Feed personal documents and data to your LLM

5. Personal Data Storage

Solid POD, Datasette, Obsidian

Your life’s data, structured and queryable

6. Personal Agent Framework

Superagent, CrewAI, Langroid, AutoGen

Design agents that act on your behalf, modular and scriptable

7. Front-end Interface

Gradio, Streamlit, Tauri app, Electron

Local GUI/UX to interact with your agent

8. Voice Interface (optional)

OpenVoice, Vosk, Whisper.cpp, Coqui TTS

Build voice-in/voice-out agent on your machine

🧠 Optional Add-ons

Offline search: Chat over your files, PDFs, photos, and emails with LlamaIndex + PrivateGPT
Calendar/notes: Sync with Obsidian or local markdown-based knowledge base
Browser agent: Auto-control browser via Puppeteer/Playwright for marketplace navigation

🧭 What’s Still Missing?

True personal OS integration (like what pAI-OS is trying to do)
Seamless data portability and contract enforcement (IEEE P7012 is working on it)
Trusted personal AI app ecosystem (some apps, like Beeper or Obsidian, are getting close)

Would you like a tailored recommendation for building a personal AI stack based on your technical skill level or use cases (e.g., journaling, research, negotiation, shopping, data querying)?

Tools / Projects	Notes
1. OS Layer	Linux, pAI-OS (in dev)	Linux gives flexibility; pAI-OS aims to be a full personal AI OS
2. Local LLM Inference	Ollama, LM Studio, PrivateGPT, LocalAI	Run models like LLaMA 3, Mistral, Phi-3 entirely offline
3. Vector Database	Chroma, Weaviate, Milvus, or SQLite with FTS5	Store and retrieve embedded text data
4. Personal Indexing	LlamaIndex, LangChain, AutoGen, GPTPilot	Feed personal documents and data to your LLM
5. Personal Data Storage	Solid POD, Datasette, Obsidian	Your life’s data, structured and queryable
6. Personal Agent Framework	Superagent, CrewAI, Langroid, AutoGen	Design agents that act on your behalf, modular and scriptable
7. Front-end Interface	Gradio, Streamlit, Tauri app, Electron	Local GUI/UX to interact with your agent
8. Voice Interface (optional)	OpenVoice, Vosk, Whisper.cpp, Coqui TTS	Build voice-in/voice-out agent on your machine

4. How can I get started?

Me:

Yes. My technical skill level is modest at best. I can write HTML and do very basic stuff in the command line. I have an extra laptop I can devote to solo work, and Ethernet to a 2.1GB/s symmetrical fiber connection to the Internet with no data limits.

ChatGPT:

Great — with your setup (modest technical skills, spare laptop, blazing-fast fiber), you’re in an excellent position to begin building a practical personal AI system without diving too deep into complex dev work. Here’s a recommended stack tailored to your current skills and goals:

🧰 YOUR PERSONAL AI STARTER STACK (2025)

🔧 1. Platform Base

Use a Linux distribution designed for ease of use and AI tinkering:

Recommended: Pop!_OS or Ubuntu 24.04
Easy GUI, good hardware compatibility, lots of online support
Optional: Use your laptop in “air-gapped” mode for privacy and full local control

🤖 2. Run a Local LLM with LM Studio

LM Studio (macOS/Windows/Linux) lets you run LLaMA 3, Mistral, Phi-3, and others without coding.
Built-in UI — chat with models locally, no internet required after download.
You can download 7B or 13B models based on your hardware.
Best for: private journaling, summarizing files, answering questions.

✅ Why it’s ideal for you:

GUI-based
Just click and download models
No API keys or command-line needed once installed

📁 3. Chat with Your Files (PrivateGPT / LlamaIndex)

PrivateGPT or GPTPilot can let you ask questions across PDFs, emails, and notes stored on your machine.
Start simple: drop files into a folder, let the AI index them.
Optionally install via Docker (one command), or use their prebuilt apps.

✅ Use case:

“Summarize this research PDF”
“What did I write about project X last month?”
“Find the email where someone mentioned a refund”

📚 4. Knowledge Base + Journaling with Obsidian

Obsidian = Markdown-based knowledge base.
Everything you write stays local.
You can point LM Studio to your notes to query your own memory.

✅ Easy way to:

Keep private journals
Index your life
Ask your AI things like “What are my goals for June?”

🗂️ 5. Personal Data Management with Datasette

Datasette by Simon Willison lets you browse and query your own data with simple SQL and CSV files.
Feeds AI with structured personal info (purchases, inventory, plans, etc.)
You can install it with a single terminal command:
bash
pip install datasette datasette serve your-data.dbdatasette serve your-data.db

✅ Combine with LM Studio to get responses like:

“What subscriptions do I need to cancel?”
“When did I last back up my photos?”

🌐 6. Optional: Browser Helper Agent (Local, Simple)

Use AutoBrowser or a simpler Puppeteer script to automate browsing tasks (like checking prices, news, account activity).
Start with prebuilt scripts. No coding needed if you copy/paste working examples.

🔒 7. Privacy Boosters

Run Tailscale or WireGuard to access your AI remotely if needed.
Use LocalAI or Ollama if you later want to run everything without GUI (more advanced).

🧭 Putting It All Together

Task

Tool

Chat privately with an AI

LM Studio

Summarize your own notes or PDFs

PrivateGPT or GPTPilot

Store and reflect on your life data

Obsidian, Datasette

Manage subscriptions and tasks

Datasette + Chat

Ask about your files or history

LlamaIndex or GPT over Obsidian notes

Use voice later (optional)

Whisper.cpp, OpenVoice (easy tools)

🪜 Next Steps (in plain order)

Install Pop!_OS or Ubuntu on your spare laptop
Install LM Studio and download a 7B model like Mistral or LLaMA 3
Start using it to answer personal questions and summarize documents
Add Obsidian to start journaling and reflecting
Try PrivateGPT or Datasette to search your past or track your life

Would you like a checklist or an install script to get started with this stack? I can also help you set up a file structure to begin tracking and querying your own data.

Tool
Chat privately with an AI	LM Studio
Summarize your own notes or PDFs	PrivateGPT or GPTPilot
Store and reflect on your life data	Obsidian, Datasette
Manage subscriptions and tasks	Datasette + Chat
Ask about your files or history	LlamaIndex or GPT over Obsidian notes
Use voice later (optional)	Whisper.cpp, OpenVoice (easy tools)

5. Forms of personal data to manage

Me:

The personal data I need to manage are health records (from 20-plus providers), financial records (fewer than health), travel records (including the kind kept by apps that watch my movements), subscriptions and other recurring payments, shows I have watched and on what channels or services (data gathered by my Samsung and Roku TVs and presumably shared with others but to which I don’t yet have access), continuing work and work history with various clients and employers, contacts, archival and current calendars, real estate, personal items bought or acquired from various sources and where they are kept or stored, and business dealings. What would be the file structure for all that, and how would I manage it? Note that I am not talking about shopping yet; that’s a separate concern.

ChatGPT:

Thanks for the detailed breakdown — you’re describing the foundation of a true personal data architecture: structured, queryable, and under your control. Let’s build you a file structure and management strategy that matches your needs without overcomplicating the tech.

📁 Recommended Top-Level Folder Structure

This folder structure is designed to be future-proof, machine-readable, and human-usable. Use plain folders, CSV/JSON/Markdown files, and optional SQLite databases to store structured data. You can later layer tools (like LLMs or Datasette) on top.

markdown

/PersonalAI_Data

│

├── Health/

│   ├── Providers/

│   ├── Visits/

│   ├── Prescriptions/

│   ├── Labs_Imaging/

│   └── Insurance_Claims/

│

├── Finance/

│   ├── Bank_Accounts/

│   ├── Credit_Cards/

│   ├── Investments/

│   └── Taxes/

│

├── Travel/

│   ├── Trips/

│   ├── Locations_History/

│   └── Maps_GPX_KML/

│

├── Subscriptions/

│   ├── Streaming/

│   ├── News_Publications/

│   ├── Software_Tools/

│   └── Utilities_Bills/

│

├── Media_Viewed/

│   ├── TV_Shows/

│   ├── Movies/

│   ├── Channels_Services/

│   └── Watch_History.csv

│

├── Work/

│   ├── Employers/

│   ├── Clients/

│   ├── Projects/

│   └── Timesheets_Invoices/

│

├── Contacts/

│   └── contacts.csv

│

├── Calendar/

│   ├── Archive_ICS/

│   ├── Current_ICS/

│   └── Events_Tags.csv

│

├── Real_Estate/

│   ├── Properties_Owned/

│   ├── Rentals/

│   └── Maintenance_Records/

│

├── Personal_Items/

│   ├── Furniture/

│   ├── Electronics/

│   ├── Art_Collectibles/

│   └── Inventory.csv

│

└── Business/

    ├── Deals/

├── Agreements/

├── Partners/

└── Notes/

🧠 Strategy for Managing It All

1. Use Consistent File Types

Data Type	File Format
Structured entries	`.csv`, `.json`, `.sqlite`
Notes or logs	`.md` (markdown)
Documents	`.pdf`, `.docx`, `.txt`
Images/attachments	store in `attachments/` subfolder per topic

2. Naming Conventions (Machine + Human Friendly)

Use ISO-like formats to make things sortable and parseable:

3. Data Capture Tools

You don’t need to build it all manually:

Use Obsidian for ongoing notes (e.g. travel journal, work logs)
Use Paperless-ngx (optional) to OCR and tag your scanned PDFs (health, finance)
Use GPS History Export from Google and Apple for travel/locations
Use email filters or Zapier to auto-download bills, statements, confirmations

4. Querying and Reviewing

Add one or more of these later:

Datasette → browse and search CSVs/SQLite locally
LM Studio + PrivateGPT → ask, “When did I last visit Dr. Chen?” or “What devices are in storage?”
LlamaIndex → index your file tree and let AI search/summarize it

✅ Summary of Benefits

Goal	This Setup Delivers
Total local control	You decide access, storage, retention
Future AI-readiness	Structured enough for agents to use
Easy to back up	Plain files, portable databases
Modular growth	Start simple, layer in AI when ready

Would you like me to generate a starter template with folder structure and some CSV/Markdown file formats filled in with mock data for each domain? That way you can load it directly into your system and start customizing.

Me again:

That’s it for now.

What should my next request be, or what other direction would you like me (or us) to take this?

Or, what are the learnings we have so far in this field that barely exists?

Oh, and excuse whatever inaccuracies ChatGPT may have floated above. Or give me fixes.

Thanks.

*The links are mine and based on searches for those names. I might have some of them wrong, however, since searches for those names bring up many scholars with the same names. And, while each name had a link, they all went to an author archive page that was absent (with one exception) of other contributions to the publication.

Toward better buy ways

August 23, 2022 / Doc Searls / 0 Comments

For sixteen years, ProjectVRM has encouraged the development of tools and services that solve business problems from the customer side. This work is toward testing a theory: that free customers are more valuable—to themselves and to the businesses they engage—than captive ones. That theory can only be tested when tools for doing that are in place.

We already have some of those tools. Our big four in the digital world are the browser, the phone, email, and texting. In the analog offline world, our best model is cash. From The Cash Model of Customer Experience:

Here’s the handy thing about cash: it gives customers scale. It does that by working the same way for everybody, everywhere it’s accepted. It’s also anonymous by nature, meaning it carries no personal identifiers. Recording what happens with it is also optional, because using it doesn’t require an entry in a ledger (as happens with cryptocurrencies). Cash has also been working this way for thousands of years. But we almost never talk about our “experience” with cash, because we don’t need to.

The problem with our four personal digital tools—browser, phone, email and texting—is that they are not fully ours. So our agency is at best compromised. Specifically,

The most popular browsers are also agents of Apple, Google, Microsoft, plus countless thousands of third parties inserting cookies and other tracking instruments into our devices.
Our phones are not just ours. They are corporate tentacles of Apple and Google, lined with countless personal data suction cups from unknown surveillance systems. (For more on this, see Apple vs (or plus) Adtech, Part I and Part II.)
Apple and Google together supply 87% of all email software and services. Apple promises privacy, while Google makes a business out of knowing the contents of your messages, plus every other Google-provided or -involved piece of software reveals to the company about your life. As for how well Apple delivers on its privacy promises, look up apple+compromised+privacy.
The original messaging service for phones, SMS, is owned and run by phone companies. Other major messaging, texting and chat services are run entirely by private companies.
Among common Internet activities, only email and browsing are based on open and simple standards. The main ones are SMTP, IMAP, and POP3 for email, and HTTP/S for browsing. Those share the Internet’s three NEA virtues: Nobody owns them, Everybody can use them, and Anybody can improve them.

This is important: If a product or service mostly works for some company, it’s not yours. You are a user or a consumer. You are not a customer; nor are you operating with full agency in a truly free market. So, while it is obvious that all of us are made more valuable to business, and to ourselves, because we use browsers, phones, email, and messaging, we can’t say that we are free while we do.

But the Internet is still young: dating in its current form—supportive of e-commerce—since 30 April 1995, when the NSFNET (one of the Internet’s backbones) was decommissioned, and its policy forbidding commercial traffic on its pipes no longer stood in the way. The Net will also be with us for dozens or hundreds of decades to come, with its base protocol, TCP/IP, continuing to support freedom for every node on it.

More importantly, there are many business problems best or only solved from the customer side. Here is a list:

Identity. Logins and passwords are burdensome leftovers from the last millennium. There should be (and already are) better ways to identify ourselves by revealing to others only what we need them to know. Working on this challenge is the SSI—Self-Sovereign Identity—movement. (Which also goes by many other names. The latest is Web5.) The solution here for individuals is tools of their own that scale. Note that there is a LOT happening here. One good way keep up with it is in the Identisphere newsletter. You can also participate by attending the twice-yearly Internet Identity Workshop, which has been going strong since 2005.
Subscriptions. Nearly all subscriptions are pains in the butt. “Deals” can be deceiving, full of conditions and changes that come without warning. New customers often get better deals than loyal customers. And there are no standard ways for customers to keep track of when subscriptions run out, need renewal, or change. The only way this can be normalized is from the customers’ side.
Terms and conditions. In the world today, nearly all of these are ones that companies proffer; and we have little or no choice about agreeing to them. Worse, in nearly all cases, the record of agreement is on the company’s side. Oh, and since the GDPR came along in Europe and the CCPA in California, entering a website has turned into an ordeal typically requiring “consent” to privacy violations the laws were meant to stop. Or worse, agreeing that a site or a service provider spying on us is a “legitimate interest.” The solution here is terms individuals can proffer and organizations can agree to. The first of these is #NoStalking, and allows a publisher to do all the advertising they want, so long as it’s not based on tracking people. Think of it as the opposite of an ad blocker. (Customer Commons is also involved in the IEEE’s P7012 Standard for Machine Readable Personal Privacy Terms.
Payments. For demand and supply to be truly balanced, and for customers to operate at full agency in an open marketplace (which the Internet was designed to support), customers should have their own pricing gun: a way to signal—and actually pay willing sellers—as much as they like, however, they like, for whatever they like, on their own terms. There is already a design for that, called EmanciPay. Its promise for the music industry alone is enormous.
Intentcasting. Advertising is all guesswork, which involves massive waste. But what if customers could safely and securely advertise what they want, and only to qualified and ready sellers? This is called intentcasting, and to some degree, it already exists. Toward this, the Intention Byway is a core focus of Customer Commons. (Also see a list of intentcasting providers on the ProjectVRM Development Work list.)
Shopping. Why can’t you have your own shopping cart—that you can take from store to store? Because we haven’t invented one yet. But we can. And when we do, all sellers are likely to enjoy more sales than they get with the current system of all-silo’d carts.
Internet of Things. We don’t have this yet. Instead, we have the Apple of things, the Amazon of things, the Google of things, the Samsung of things, the Sonos of things, and so on, each silo’d in separate systems we don’t control. Things we own on the Internet should be our things. We should be able to control them, as independent operators, as we do with our computers and mobile devices. (Also, by the way, things don’t need to be intelligent or connected to belong to the Internet for us to control what’s known about them. They can be, or have, picos.)
Loyalty. All loyalty programs are gimmicks, and coercive. True loyalty is worth far more to companies than the coerced kind, and only customers are in a position to truly and fully express it. We should have our own loyalty programs, to which companies are members, rather than the reverse.
Privacy. We’ve had privacy tech in the physical world since the inventions of clothing, shelter, locks, doors, shades, shutters, and other ways to limit what others can see or hear—and to signal to others what’s okay and what’s not. Instead, all we have are unenforced promises by others not to watch our naked selves, or to report what they see to others. Or worse, coerced urgings to “accept” spying on us and distributing harvested information about us to parties unknown, with no record of what we’ve agreed to.
Customer service. There are no standard ways for customers and companies to enjoy relationships, with useful data flowing both ways, and for help to come when it’s needed. Instead, every company does it differently, in its own silo’d system. For more on this, see # 12 below.
Regulatory compliance. Especially around privacy. Because really, all the GDPR and the CCPA want is for companies to stop spying on people. Without any privacy tech on the individual’s side, however, responsibility for everyone’s privacy is entirely a corporate burden. This is unfair to people and companies alike, as well as insane—because it can’t work. (Worse, nearly all B2B “compliance” solutions only solve the felt need by companies to obey the letter of a law while ignoring its spirit. But if people have their own ways to signal their privacy requirements and expectations (as they do with clothing and shelter in the natural world), life gets a lot easier for everybody, because there’s something there to respect. We don’t have that yet online, but it shouldn’t be hard. For more on this, see Privacy is Personal and our own Privacy Manifesto.
Real relationships: ones in which both parties actually care about and help each other, and good market intelligence flows both ways. Marketing by itself can’t do it. All you get is the sound of one hand slapping. (Or, more typically, pleasuring itself with mountains of data and fanciful maths first described in Darrell Huff’s How to Lie With Statistics, written in 1954). Sales departments can’t do it either, because their job is done once the relationship is established. CRM can’t do it without a VRM hand to shake on the customer’s side. From What Makes a Good Customer: “Consider the fact that a customer’s experience with a product or service is far more rich, persistent and informative than is the company’s experience selling those things, or learning about their use only through customer service calls (or even through pre-installed surveillance systems such as those which for years now have been coming in new cars). The curb weight of customer intelligence (knowledge, know-how, experience) with a company’s products and services far outweighs whatever the company can know or guess at. So, what if that intelligence were to be made available by the customer, independently, and in standard ways that work at scale across many or all of the companies the customer deals with?”
Any-to-any/many-to-many business: a market environment where anybody can easily do business with anybody else, mostly free of centralizers or controlling intermediaries (with due respect for inevitable tendencies toward federation). There is some movement in this direction around what’s being called Web3.
Life management platforms. KuppingerCole has been writing and thinking about these since not long after they gave ProjectVRM an award for its work, way back in 2007. These have gone by many labels: personal data clouds, vaults, dashboards, cockpits, lockers, and other ways of characterizing personal control of one’s life where it meets and interacts with the digital world. The personal data that matters in these is the kind that matters in one’s life: health (e.g. HIEofOne), finances, property, subscriptions, contacts, calendar, creative works, and so on, including personal archives for all of it. Social data out in the world also matters, but is not the place to start, because that data is less important than the kinds of personal data listed above—most of which has no business being sold or given away for goodies from marketers. (See We can do better than selling our data.)

All of these, however, are ocean-boiling ideas. In other words, not easy, especially without what the military calls “robust funding.” So our strategies are best aimed toward what are called “blue” rather than “red” (blood filled) oceans. One of those is the Byway (or “buyway”) project by Customer Commons, in Bloomington, Indiana. An excerpt:

There are three parts to the Byway project as it now stands (in July 2022): an online community (Small Town/mastodon), a matcher tool (Intently), and a local e-commerce “buyway.” (For more on that one, download the slide deck presented by Doc and Joyce at The Mill in November 2021. Or download this earlier and shorter one.)

We also see the Byway as complementary to, rather than competitive with, developments with similar and overlapping ambitions, such as SSI, DIDcomm, picos, JLINC, Digital Homesteading / Dazzle and many others.

Joyce and I, both founders and board members of Customer Commons, are heading up to DWeb Camp in a few minutes, and plan to make progress there on Byway development. I’ll report here on progress.

[Later…] DWeb Camp was a great success for us. We are now in planning conversations with developers and others. Stay tuned for more on that.

How the Web sucks

September 11, 2021 / Doc Searls / 0 Comments

This spectrum of emojis is a map of the Web’s main occupants (the middle three) and outliers (the two on the flanks). It provides a way of examining who is involved, where regulation fits, and where money gets invested and made. Yes, it’s overly broad, but I think it’s helpful in understanding where things went wrong and why. So let’s start.

Wizards are tech experts who likely run their own servers and keep private by isolating themselves and communicating with crypto. They enjoy the highest degrees of privacy possible on and around the Web, and their approach to evangelizing their methods is to say “do as I do” (which most of us, being Muggles, don’t). Relatively speaking, not much money gets made by or invested in Wizards, but much money gets made because of Wizards’ inventions. Those inventions include the Internet, the Web, free and open source software, and much more. Without Wizards, little of what we enjoy in the digital world today would be possible. However, it’s hard to migrate their methods into the muggle population.

‍Muggles are the non-Wizards who surf the Web and live much of their digital lives there, using Web-based services on mobile apps and browsers on computers. Most of the money flowing into the webbed economy comes from Muggles. Still, there is little investment in providing Muggles with tools for operating or engaging independently and at scale across the websites and services of the world. Browsers and email clients are about it, and the most popular of those (Chrome, Safari, Edge) are by the grace of corporate giants. Almost everything Muggles do on the Web and mobile devices is on apps and tools that are what the trade calls silos or walled gardens: private spaces run by the websites and services of the world.

Sites. This category also includes clouds and the machinery of e-commerce. These are at the heart of the Web: a client-server (aka calf-cow) top-down, master-slave environment where servers rule and clients obey. It is in this category that most of the money on the Web (and e-commerce in general) gets made, and into which most investment money flows. It is also here that nearly all development n the connected world today happens.

Ad-tech, aka adtech, is the home of surveillance capitalism, which relies on advertisers and their agents knowing all that can be known about every Muggle. This business also relies on absent Muggle agency, and uses that absence as an excuse for abusing the privilege of committing privacy violations that would be rude or criminal in the natural world. Also involved in this systematic compromise are adtech’s dependents in the websites and Web services of the world, which are typically employed by adtech to inject tracking beacons in Muggles’ browsers and apps. It is to the overlap between adtech and sites that all privacy regulation is addressed. This is why, the GDPR sees Muggles as mere “data subjects,” and assigns responsibility for Muggle’s privacy to websites and services the regulation calls “data controllers” and “data processors.” The regulation barely imagines that Muggles could perform either of those roles, even though personal computing was invented so every person can do both. (By the way, the adtech business and many of its dependents in publishing like to say the Web is free because advertising pays for it. But the Web is as free by nature as are air and sunlight. And most of the money Google makes, for example, comes from plain old search advertising, which can get along fine without tracking. There is also nothing about advertising itself that requires tracking.)

Crime happens on the Web, but its center of gravity is outside, on the dark web. This is home to botnets, illegal porn, terrorist activity, ransom attacks, cyber espionage, and so on. There is a lot of overlap between crime and adtech, however, given the moral compromises required for adtech to function, plus the countless ways that bots, malware and other types of fraud are endemic to the adtech business. (Of course, to be an expert criminal on the dark web requires a high degree of wizardry. So I one could arrange these categories in a circle, with an overlap between wizards and criminals.)

I offer this set of distinctions for several reasons. One is to invite conversation about how we have failed the Web and the Web has failed us—the Muggles of the world—even though we enjoy apparently infinite goodness from the Web and handy services there. Another is to explain why ProjectVRM has been more aspirational than productive in the fifteen years it has been working toward empowering people on the commercial Net. (Though there has been ample productivity.) But mostly it is to explain why I believe we will be far more productive if we start working outside the Web itself. This is why our spinoff, Customer Commons, is pushing forward with the Byway toward i-commerce. Check it out.

Finally, I owe the idea for this visualization to Iain Henderson, who has been with ProjectVRM since before it started. (His other current involvements are with JLINC and Customer Commons.) Hope it proves useful.

Toward e-commerce 2.0

March 25, 2021 / Doc Searls / 0 Comments

Phil Windley explains e-commerce 1.0 in a single slide that says this:

One reason this happened is that client-server, aka calf-cow (illustrated in Thinking outside the browser) has been the default format for all relationships on the Web, and cookies are required to maintain those relationships. The result is a highly lopsided power asymmetry in which the calves have no more power than the cows give them. As a result,

The calves have no easy way even to find (much less to understand or create) the cookies in their browsers’ jars.
The calves have no identity of their own, but instead have as many different identities as there are websites that know (via cookies) their visiting browsers. This gives them no independence, much less a place to stand like Archimedes, with a lever on the world. The browser may be a great tool, but it’s neither that place to stand, nor a sufficient lever. (Yes, it should have been, and maybe still could be; but meanwhile, it isn’t.)
All the “agreements” the calves have with the websites’ cows leave no readable record on the calves’ side. This severely limits their capacity for dispute, which is required for a true relationship.
There exists no independent way the calves to signal their intentions—such as interests in purchase, conditions for engagement, or the need to be left alone (which is how Brandeis and Warren define privacy).

In other words, the best we can do in e-commerce 1.0 is what the calf-cow system provides: ways for calves to depend utterly on means the cows provide. And some of those cows are mighty huge.

Nearly all of signaling between demand and supply remains trapped inside these silos and walled gardens. We search inside their systems, we are notified of product and service availability inside their systems, we make agreements inside their systems (to terms and conditions they provide and require), or privacy is dependent on their systems, and product and service delivery is handled either inside their systems or through allied and dependent systems.

Credit where due: an enormous amount of good has come out of these systems. But a far larger amount of good is MLOTT—money left on the table—because there is a boundless sum and variety of demand and supply that still cannot easily signal their interest, intentions of presence to each other in the digital world.

Putting that money on the table is our job in e-commerce 2.0.

So here is a challenge: tell us how we can do that without using browsers.

Some of us here do have ideas. But we’d like to hear from you first.

Cross-posted at the ProjectVRM blog, here.

What if we called cookies “worms”?

May 26, 2020 / Doc Searls / 8 Comments

While you ponder that, read Exclusive: New York Times phasing out all 3rd-party advertising data, by Sara Fischer in Axios.

The cynic in me translates the headline as “Leading publishers cut out the middle creep to go direct with tracking-based advertising.” In other words, same can, nicer worms.

But maybe that’s wrong. Maybe we’ll only be tracked enough to get put into one of those “45 new proprietary first-party audience segments” or “at least 30 more interest segments.” And maybe only tracked on site.

But we will be tracked, presumably. Something needs to put readers into segments. What else will do that?

So, here’s another question: Will these publishers track readers off-site to spy on their “interests” elsewhere? Or will tracking be confined to just what the reader does while using the site?

Anyone know?

In a post on the ProjectVRM list, Adrian Gropper says this about the GDPR (in response to what I posted here): “GDPR, like HIPAA before it, fails because it allows an unlimited number of dossiers of our personal data to be made by unlimited number of entities. Whether these copies were made with consent or without consent through re-identification, the effect is the same, a lack of transparency and of agency.”

So perhaps it’s progress that these publishers (the Axios story mentions The Washington Post and Vox as well as the NYTimes) are only keeping limited dossiers on their readers alone.

But that’s not progress enough.

We need global ways to say to every publisher how little we wish them to know about us. Also ways to keep track of what they actually do with the information they have. (And we’re working on those. )

Being able to have one’s data back (e.g. via the CCPA) is a kind of progress (as is the law’s discouragement of collection in the first place), but we need technical as well as legal mechanisms for projecting personal agency online. (Models for this are Archimedes and Marvel heroes.) Not just more ways to opt out of being observed more than we’d like—especially when we still lack ways to audit what others do with the permissions we give them.

That’s the only way we’ll get rid of the worms.

Bonus link.

Markets as conversations with robots

February 5, 2020 / Doc Searls / 7 Comments

From the Google AI blog: Towards a Conversational Agent that Can Chat About…Anything:

In “Towards a Human-like Open-Domain Chatbot”, we present Meena, a 2.6 billion parameter end-to-end trained neural conversational model. We show that Meena can conduct conversations that are more sensible and specific than existing state-of-the-art chatbots. Such improvements are reflected through a new human evaluation metric that we propose for open-domain chatbots, called Sensibleness and Specificity Average (SSA), which captures basic, but important attributes for human conversation. Remarkably, we demonstrate that perplexity, an automatic metric that is readily available to any neural conversational models, highly correlates with SSA.

A chat between Meena (left) and a person (right).

Meena

Meena is an end-to-end, neural conversational model that learns to respond sensibly to a given conversational context. The training objective is to minimize perplexity, the uncertainty of predicting the next token (in this case, the next word in a conversation). At its heart lies the Evolved Transformer seq2seq architecture, a Transformer architecture discovered by evolutionary neural architecture search to improve perplexity.

Concretely, Meena has a single Evolved Transformer encoder block and 13 Evolved Transformer decoder blocks as illustrated below. The encoder is responsible for processing the conversation context to help Meena understand what has already been said in the conversation. The decoder then uses that information to formulate an actual response. Through tuning the hyper-parameters, we discovered that a more powerful decoder was the key to higher conversational quality.

So how about turning this around?

What if Google sold or gave a Meena model to people—a model Google wouldn’t be able to spy on—so people could use it to chat sensibly with robots or people at companies?

Possible?

If, in the future (which is now—it’s freaking 2020 already), people will have robots of their own, why not one for dealing with companies, which themselves are turning their sales and customer service systems over to robots anyway?

We’re not data. We’re digital. Let’s research that.

November 17, 2019 / Doc Searls / 1 Comment

The University of Chicago Press’ summary of How We Became Our Data says author Colin Koopman—

excavates early moments of our rapidly accelerating data-tracking technologies and their consequences for how we think of and express our selfhood today. Koopman explores the emergence of mass-scale record keeping systems like birth certificates and social security numbers, as well as new data techniques for categorizing personality traits, measuring intelligence, and even racializing subjects. This all culminates in what Koopman calls the “informational person” and the “informational power” we are now subject to. The recent explosion of digital technologies that are turning us into a series of algorithmic data points is shown to have a deeper and more turbulent past than we commonly think.

Got that? Good.

Now go over to the book’s Amazon page, do the “look inside” thing and then go to the chapter titled “Redesign: Data’s Turbulent Pasts and Future Paths” (p. 173) and read forward through the next two pages (which is all it allows). In that chapter, Koopman begins to develop “the argument that information politics is separate from communicative politics.” My point with this is that politics are his frames (or what he calls “embankments”) in both cases.

Now take three minutes for A Smart Home Neighborhood: Residents Find It Enjoyably Convenient Or A Bit Creepy, which ran on NPR one recent morning. It’s about a neighborhood of Amazon “smart homes” in a Seattle suburb. Both the homes and the neighborhood are thick with convenience, absent of privacy, and reliant on surveillance—both by Amazon and by smart homes’ residents. In the segment, a guy with the investment arm of the National Association of Realtors says, “There’s a new narrative when it comes to what a home means.” The reporter enlarges on this: “It means a personalized environment where technology responds to your every need. Maybe it means giving up some privacy. These families are trying out that compromise.” In one case the teenage daughter relies on Amazon as her “butler,” while her mother walks home on the side of the street without Amazon doorbells, which have cameras and microphones, so she can escape near-ubiquitous surveillance in her smart ‘hood.

Lets visit three additional pieces. (And stay with me. There’s a call to action here, and I’m making a case for it.)

First, About face, a blog post of mine that visits the issue of facial recognition by computers. Like the smart home, facial recognition is a technology that is useful both for powerful forces outside of ourselves—and for ourselves. (As, for example, in the Amazon smart home.) To limit the former (surveillance by companies), it typically seems we need to rely on what academics and bureaucrats blandly call policy (meaning public policy: principally lawmaking and regulation).

As this case goes, the only way to halt or slow surveillance of individuals by companies is to rely on governments that are also incentivized (to speed up passport lines, solve crimes, fight terrorism, protect children, etc.) to know as completely as possible what makes each of us unique human beings: our faces, our fingerprints, our voices, the veins in our hands, the irises of our eyes. It’s hard to find a bigger hairball of conflicting interests and surely awful outcomes.

Second, What does the Internet make of us, where I conclude with this:

My wife likens the experience of being “on” the Internet to one of weightlessness. Because the Internet is not a thing, and has no gravity. There’s no “there” there. In adjusting to this, our species has around two decades of experience so far, and only about one decade of doing it on smartphones, most of which we will have replaced two years from now. (Some because the new ones will do 5G, which looks to be yet another way we’ll be captured by phone companies that never liked or understood the Internet in the first place.)

But meanwhile we are not the same. We are digital beings now, and we are being made by digital technology and the Internet. No less human, but a lot more connected to each other—and to things that not only augment and expand our capacities in the world, but replace and undermine them as well, in ways we are only beginning to learn.

Third, Mark Stahlman’s The End of Memes or McLuhan 101, in which he suggests figure/ground and formal cause as bigger and deeper ways to frame what’s going on here. As Mark sees it (via those two frames), the Big Issues we tend to focus on—data, surveillance, politics, memes, stories—are figures on a ground that formally causes all of their forms. (The form in formal cause is the verb to form.) And that ground is digital technology itself. Without digital tech, we would have little or none of the issues so vexing us today.

The powers of digital tech are like those of speech, tool-making, writing, printing, rail transport, mass production, electricity, railroads, automobiles, radio and television. As Marshall McLuhan put it (in The Medium is the Massage), each of new technology is a cause that “works us over completely” while it’s busy forming and re-forming us and our world.

McLuhan also teaches that each new technology retrieves what remains useful about the technologies it obsolesces. Thus writing retrieved speech, printing retrieved writing, radio retrieved both, and TV retrieved radio. Each new form was again a formal cause of the good and bad stuff that worked over people and their changed worlds. (In modern tech parlance, we’d call the actions of formal cause “disruptive.”)

Digital tech, however, is less disruptive and world-changing than it is world-making. In other words, it is about as massively formal (as in formative) as tech can get. And it’s as hard to make sense of this virtual world than it is to sense roundness in the flat horizons of our physical one. It’s also too easy to fall for the misdirections inherent in all effects of formal causes. For example, it’s much easier to talk about Trump than about what made him possible. Think about it: absent of digital tech, would we have had Trump? Or even Obama? McLuhan’s blunt perspective may help. “People,” he said, “do not want to know why radio caused Hitler and Gandhi alike.”

So here’s where I am now on all this:

We have not become data. We have become digital, while remaining no less physical. And we can’t understand what that means if we focus only on data. Data is more effect than cause.
Politics in digital conditions is almost pure effect, and those effects misdirect our attention away from digital as a formal cause. To be fair, it is as hard for us to get distance on digital as it is for a fish to get distance on water. (David Foster Wallace to the Kenyon College graduating class of 2005: Greetings parents and congratulations to Kenyon’s graduating class of 2005. There are these two young fish swimming along and they happen to meet an older fish swimming the other way, who nods at them and says “Morning, boys. How’s the water?” And the two young fish swim on for a bit, and then eventually one of them looks over at the other and goes “What the hell is water?”)
Looking to policy for cures to digital ills is both unavoidable and sure to produce unintended consequences. For an example of both, look no farther than the GDPR. In effect (so far), it has demoted human beings to mere “data subjects,” located nearly all agency with “data controllers” and “data processors,” has done little to thwart unwelcome surveillance, and has caused boundlessly numerous, insincere and misleading “cookie notices”—almost all of which are designed to obtain “consent” to what the regulation was meant to stop. In the process it has also called into being monstrous new legal and technical enterprises, both satisfying business market demand for ways to obey the letter of the GDPR while violating its spirit. (Note: there is still hope for applying the the GDPR. But let’s get real: demand in the world of sites and services for violating the GDPR’s spirit, and for persisting in the practice of surveillance capitalism, far exceeds demand for compliance and true privacy-respecting behavior. Again, so far.)
Power is moving to the edge. That’s us. Yes, there is massive concentration of power and money in the hands of giant companies on which we have become terribly dependent. But there are operative failure modes in all those companies, and digital tech remains ours no less than theirs.

I could make that list a lot longer, but that’s enough for my main purpose here, which is to raise the topic of research.

ProjectVRM was conceived in the first place as a development and research effort. As a Berkman Klein Center project, in fact, it has something of an obligation to either do research, or to participate in it.

We’ve encouraged development for thirteen years. Now some of that work is drifting over to the Me2B Alliance which has good leadership, funding and participation. There is also good energy in the IEEE 7012 working group and Customer Commons, both of which owe much to ProjectVRM.

So perhaps now is a good time to start at least start talking about research. Two possible topics: facial recognition and smart homes. Anyone game?

What turns out to be a draft version of this post ran on the ProjectVRM list. If you’d like to help, please subscribe and join in on that link. Thanks.

Personal scale

April 27, 2019 / Doc Searls / 3 Comments

Way back in 1995, when our family was still new to the Web, my wife asked a question that is one of the big reasons I started ProjectVRM: Why can’t I take my own shopping cart from one site to another?

The bad but true answer is that every site wants you to use their shopping cart. The good but not-yet-true answer is that nobody has invented it yet. By that I mean: not a truly personal one, based on open standards that make it possible for lots of developers to compete at making the best personal shopping cart for you.

Think about what you might be able to do with a PSC (Personal Shopping Cart) online that you can’t do with a physical one offline:

Take it from store to store, just as you do with your browser. This should go without saying, but it’s still worth repeating, because it would be way cool.
Have a list of everything parked already in your carts within each store.
Know what prices have changed, or are about to change, for the products in your carts in each store.
Notify every retailer you trust that you intend to buy X, Y or Z, with restrictions (meaning your terms and conditions) on the use of that information, and in a way that will let you know if those restrictions are violated. This is called intentcasting, and there are a pile of companies already in that business.
Have a way to change your name and other contact information, for all the stores you deal with, in one move.
Control your subscriptions to each store’s emailings and promotional materials.
Have your own way to express genuine loyalty , rather than suffering with as many coercive and goofy “loyalty programs” as there are companies
Have a standard way to share your experiences with the companies that make and sell the products you’ve bought, and to suggest improvements—and for those companies to share back updates and improvements you should know about.
Have wallets of your own, rather than only those provided by platforms.
Connect to your collection of receipts, instruction manuals and other relevant information for all the stuff you’ve already bought or currently rent. (Note that this collection is for the Internet of your things—one you control for yourself, and is not a set of suction cups on corporate tentacles.)
Your own standard way to call for service or support, for stuff you’ve bought or rented, rather than suffering with as many different ways to do that as there are companies you’ve engaged

All of these things are Me2B, and will give each of us scale, much as the standards that make the Internet, browsers and email all give us scale. And that scale will be just as good for the companies we deal with as are the Internet, browsers and email.

If you think “none of the stores out there will want any of this, because they won’t control it,” think about what personal operating systems and browsers on every device have already done for stores by making the customer interface standard. What we’re talking about here is enlarging that interface.

I’d love to see if there is any economics research and/or scholarship on personal scale and its leverage (such as personal operating systems, devices and browsers give us) in the digital world). Because it’s a case that needs to be made.

Of course, there’s money to me made as well, because there will be so many more, better and standard ways for companies to deal with customers than current tools (including email, apps and browsers) can by themselves.

The Wurst of the Web

March 23, 2019 / Doc Searls / 1 Comment

Don’t think about what’s wrong on the Web. Think about what pays for it. Better yet, look at it.

Start by installing Privacy Badger in your browser. Then look at what it tells you about every site you visit. With very few exceptions (e.g. Internet Archive and Wikipedia), all are putting tracking beacons (the wurst cookie flavor) in your browser. These then announce your presence to many third parties, mostly unknown and all unseen, at nearly every subsequent site you visit, so you can be followed and profiled and advertised at. And your profile might be used for purposes other than advertising. There’s no way to tell.

This practice—tracking people without their invitation or knowledge—is at the dark heart and sold soul of what Shoshana Zuboff calls Surveillance Capitalism and Brett Frischmann and Evan Selinger call Re-engineering Humanity. (The italicized links go to books on the topic, both of which came out in the last year. Buy them.)

While that system’s business is innocuously and misleadingly called advertising, the surveilling part of it is called adtech. The most direct ancestor of adtech is not old fashioned brand advertising. It’s direct marketing, best known as junk mail. (I explain the difference in Separating Advertising’s Wheat and Chaff.)

In the online world, brand advertising and adtech look the same, but underneath they are as different as bread and dirt. While brand advertising is aimed at broad populations and sponsors media it considers worthwhile, adtech does neither. Like junk mail, adtech wants to be personal, wants a direct response, and ignores massive negative externalities. It also uses media to mark, track and advertise at eyeballs, wherever those eyeballs might show up. (This is how, for example, a Wall Street Journal reader’s eyeballs get shot with an ad for, say, Warby Parker, on Breitbart.) So adtech follows people, profiles them, and adjusts its offerings to maximize engagement, meaning getting a click. It also works constantly to put better crosshairs on the brains of its human targets; and it does this for both advertisers and other entities interested in influencing people. (For example, to swing an election.)

For most reporters covering this, the main objects of interest are the two biggest advertising intermediaries in the world: Facebook and Google. That’s understandable, but they’re just the tip of the wurstberg. Also, in the case of Facebook, it’s quite possible that it can’t fix itself. See here:

How easy do you think it is for Facebook to change: to respond positively to market and regulatory pressures?

Consider this possibility: it can’t.

One reason is structural. Facebook is comprised of many data centers, each the size of a Walmart or few, scattered around the world and costing many $billions to build and maintain. Those data centers maintain a vast and closed habitat where more than two billion human beings share all kinds of revealing personal shit about themselves and each other, while providing countless ways for anybody on Earth, at any budget level, to micro-target ads at highly characterized human targets, using up to millions of different combinations of targeting characteristics (including ones provided by parties outside Facebook, such as Cambridge Analytica, which have deep psychological profiles of millions of Facebook members). Hey, what could go wrong?

In three words, the whole thing.

The other reason is operational. We can see that in how Facebook has handed fixing what’s wrong with it over to thousands of human beings, all hired to do what The Wall Street Journal calls “The Worst Job in Technology: Staring at Human Depravity to Keep It Off Facebook.” Note that this is not the job of robots, AI, ML or any of the other forms of computing magic you’d like to think Facebook would be good at. Alas, even Facebook is still a long way from teaching machines to know what’s unconscionable. And can’t in the long run, because machines don’t have a conscience, much less an able one.

You know Goethe’s (or hell, Disney’s) story of The Sorceror’s Apprentice? Look it up. It’ll help. Because Mark Zuckerberg is both the the sorcerer and the apprentice in the Facebook version of the story. Worse, Zuck doesn’t have the mastery level of either one.

Nobody, not even Zuck, has enough power to control the evil spirits released by giant machines designed to violate personal privacy, produce echo chambers beyond counting and amplify tribal prejudices (including genocidal ones)—besides whatever good it does for users and advertisers.

The hard work here is lsolving the problems that corrupted Facebook so thoroughly, and are doing the same to all the media that depend on surveillance capitalism to re-engineer us all.

Meanwhile, because lawmaking is moving apace in any case, we should also come up with model laws and regulations that insist on respect for private spaces online. The browser is a private space, so let’s start there.

Here’s one constructive suggestion: get the browser makers to meet next month at IIW, an unconference that convenes twice a year at the Computer History Museum in Silicon Valley, and work this out.

Ann Cavoukian (@AnnCavoukian) got things going on the organizational side with Privacy By Design, which is now also embodied in the GDPR. She has also made clear that the same principles should apply on the individual’s side. So let’s call the challenge there Privacy By Default. And let’s have it work the same in all browsers.

I think it’s really pretty simple: the default is no. If we want to be tracked for targeted advertising or other marketing purposes, we should have ways to opt into that. But not some modification of the ways we have now, where every @#$%& website has its own methods, policies and terms, none of which we can track or audit. That is broken beyond repair and needs to be pushed off a cliff.

Among the capabilities we need on our side are 1) knowing what we have opted into, and 2) ways to audit what is done with information we have given to organizations, or has been gleaned about us in the course of our actions in the digital world. Until we have ways of doing both, we need to zero-base the way targeted advertising and marketing is done in the digital world. Because spying on people without an invitation or a court order is just as wrong in the digital world as it is in the natural one. And you don’t need spying to target.

And don’t worry about lost business. There are many larger markets to be made on the other side of that line in the sand than we have right now in a world where more than 2 billion people block ads, and among the reasons they give are “Ads might compromise my online privacy,” and “Stop ads being personalized.”

Those markets will be larger because incentives will be aligned around customer agency. And they’ll want a lot more from the market’s supply side than surveillance based sausage, looking for clicks.

The only path from subscription hell to subscription heaven

September 21, 2018 / Doc Searls / 1 Comment

I subscribe to Vanity Fair. I also get one of its newsletters, replicated on a website called The Hive. At the top of the latest Hive is this come-on: “For all that and more, don’t forget to sign up for our metered paywall, the greatest innovation since Nitroglycerin, the Allman Brothers, and the Hangzhou Grand Canal.”

When I clicked on the metered paywall link, it took me to a plain old subscription page. So I thought, “Hey, since they have tracking cruft appended to that link, shouldn’t it take me to a page that says something like, “Hi, Doc! Thanks for clicking, but we know you’re already a paying subscriber, so don’t worry about the paywall”?

So I clicked on the Customer Care link to make that suggestion. This took me to a login page, where my password manager filled in the blanks with one of my secondary email addresses. That got me to my account, which says my Condé Nast subscriptions look like this:

Oddly, the email address at the bottom there is my primary one, not the one I just logged in with. (Also oddly, I still get Wired.)

So I went to the Vanity Fair home page, found myself logged in there, and clicked on “My Account.” This took me to a page that said my email address was my primary one, and provided a way to change my password, to subscribe or unsubscribe to four newsletters, and a way to “Receive a weekly digest of stories featuring the players you care about the most.” The link below said “Start following people.” No way to check my account itself.

So I logged out from the account page I reached through the Customer Care link, and logged in with my primary email address, again using my password manager. That got me to an account page with the same account information you see above.

It’s interesting that I have two logins for one account. But that’s beside more important points, one of which I made with this message I wrote for Customer Care in the box provided for that:

Curious to know where I stand with this new “metered paywall” thing mentioned in the latest Hive newsletter. When I go to the link there — https://subscribe.condenastdigital.com/subscribe/splits/vanityfair/ — I get an apparently standard subscription page. I’m guessing I’m covered, but I don’t know. Also, even as a subscriber I’m being followed online by 20 or more trackers (reports Privacy Badger), supposedly for personalized advertising purposes, but likely also for other purposes by Condé Nast’s third parties. (Meaning not just Google, Facebook and Amazon, but Parsely and indexww, which I’ve never heard of and don’t trust. And frankly I don’t trust those first three either.) As a subscriber I’d want to be followed only by Vanity Fair and Condé Nast for their own service-providing and analytic purposes, and not by who-knows-what by all those others. If you could pass that request along, I thank you. Cheers, Doc

When I clicked on the Submit button, I got this:

An error occurred while processing your request.An error occurred while processing your request.

Please call our Customer Care Department at 1-800-667-0015 for immediate assistance or visit Vanity Fair Customer Care online.

Invalid logging session ID (lsid) passed in on the URL. Unable to serve the servlet you’ve requested.

So there ya go: one among .X zillion other examples of subscription hell, differing only in details.

Fortunately, there is a better way. Read on.

The Path

The only way to pave a path from subscription and customer service hell to the heaven we’ve never had is by normalizing the ways both work, across all of business. And we can only do this from the customer’s side. There is no other way. We need standard VRM tools to deal with the CRM and CX systems that exist on the providers’ side.

We’ve done this before.

We fixed networking, publishing and mailing online with the simple and open standards that gave us the Internet, the Web and email. All those standards were easy for everyone to work with, supported boundless economic and social benefits, and began with the assumption that individuals are full-privilege agents in the world.

The standards we need here should make each individual subscriber the single point of integration for their own data, and the responsible party for changing that data across multiple entities. (That’s basically the heart of VRM.)

This will give each of us a single way to see and manage many subscriptions, see notifications of changes by providers, and make changes across the board with one move. VRM + CRM.

The same goes for customer care service requests. These should be normalized the same way.

In the absence of normalizing how people manage subscription and customer care relationships, all the companies in the world with customers will have as many different ways of doing both as there are companies. And we’ll languish in the login/password hell we’re in now.

The VRM+CRM cost savings to those companies will also be enormous. For a sense of that, just multiply what I went through above by as many people there are in the world with subscriptions, and multiply that result by the number of subscriptions those people have — and then do the same for customer service.

We can’t fix this inside the separate CRM systems of the world. There are too many of them, competing in too many silo’d ways to provide similar services that work differently for every customer, even when they use the same back-ends from Oracle, Salesforce, SugarCRM or whomever.

Fortunately, CRM systems are programmable. So I challenge everybody who will be at Salesforce’s Dreamforce conference next week to think about how much easier it will be when individual customers’ VRM meets Salesforce B2B customers’ CRM. I know a number of VRM people who will be there, including Iain Henderson, of the bonus link below. Let me know you’re interested and I’ll make the connection.

And come work with us on standards. Here’s one.

Bonus link: Me-commerce — from push to pull, by Iain Henderson (@iaianh1)

🧑‍💼 Jizhi Zhang

🧑‍🔬 Keqin Bao, Wenjie Wang, Yang Zhang, Wentao Shi, Fuli Feng, Tat‑Seng Chua

🧑‍🎓 Wanhong Xu

🧠 Broader Context & Contributions

🧭 Why This Matters

📚 Academic Scope

1. Established Field: Recommender Systems

📈 Current Trends & Expanding Frontiers

2. LLMs + Recommendation = New Subfield

3. Multimodal & Multitask Integration

4. Industry Investment

🧭 Estimate of Scope

🧠 In Summary

🧠 1. Kwaai and pAI-OS (South Africa / Global)

🛠️ 2. Indie AI and Personal Agent projects

Examples:

🌐 3. Solid Project (by Tim Berners-Lee)

📜 4. IEEE P7012 – MyTerms

🧩 5. Project VRM (Vendor Relationship Management)

🔒 6. Open-source LLM stacks for local use

🧭 Summary

🚧 The Gap

👤 People Leading or Advancing the Personal AI Revolution

🧑‍🔬 Tim Berners-Lee

👤 Doc Searls

🧠 Tawanda Kembo & Kwaai team

🧑‍💻 Simon Willison

🤖 Brian Roemmele

🛠️ A DIY Personal AI Stack (2025 Edition)

🧩 Core Components

🧠 Optional Add-ons

🧭 What’s Still Missing?

🧰 YOUR PERSONAL AI STARTER STACK (2025)

🔧 1. Platform Base

🤖 2. Run a Local LLM with LM Studio

📁 3. Chat with Your Files (PrivateGPT / LlamaIndex)

📚 4. Knowledge Base + Journaling with Obsidian

🗂️ 5. Personal Data Management with Datasette

🌐 6. Optional: Browser Helper Agent (Local, Simple)

🔒 7. Privacy Boosters

🧭 Putting It All Together

🪜 Next Steps (in plain order)

📁 Recommended Top-Level Folder Structure

🧠 Strategy for Managing It All

1. Use Consistent File Types

2. Naming Conventions (Machine + Human Friendly)

3. Data Capture Tools

4. Querying and Reviewing

✅ Summary of Benefits

Recent Posts

Recent Comments

Archives

Categories

🧑‍🔬 Keqin Bao, Wenjie Wang, Yang Zhang, Wentao Shi, Fuli Feng, Tat‑Seng Chua

🧑‍🎓 Wanhong Xu