Startups

Xayn is privacy-safe, personalized mobile web search powered by on-device AIs

Comment

As TC readers know, the tricky trade-off of the modern web is privacy for convenience. Online tracking is how this ‘great intimacy robbery’ is pulled off. Mass surveillance of what Internet users are looking at underpins Google’s dominant search engine and Facebook’s social empire, to name two of the highest profile ad-funded business models.

TechCrunch’s own corporate overlord, Verizon, also gathers data from a variety of end points — mobile devices, media properties like this one — to power its own ad targeting business.

Countless others rely on obtaining user data to extract some perceived value. Few if any of these businesses are wholly transparent about how much and what sort of private intelligence they’re amassing — or, indeed, exactly what they’re doing with it. But what if the web didn’t have to be like that?

Berlin-based Xayn wants to change this dynamic — starting with personalized but privacy-safe web search on smartphones.

Today it’s launching a search engine app (on Android and iOS) that offers the convenience of personalized results but without the ‘usual’ shoulder surfing. This is possible because the app runs on-device AI models that learn locally. The promise is no data is ever uploaded (though trained AI models themselves can be).

The team behind the app, which is comprised of 30% PhDs, has been working on the core privacy vs convenience problem for some six years (though the company was only founded in 2017); initially as an academic research project — going on to offer an open source framework for masked federated learning, called XayNet. The Xayn app is based on that framework.

They’ve raised some €9.5 million in early stage funding to date — with investment coming from European VC firm Earlybird; Dominik Schiener (Iota co-founder); and the Swedish authentication and payment services company, Thales AB.

Now they’re moving to commercialize their XayNet technology by applying it within a user-facing search app — aiming for what CEO and co-founder, Dr Leif-Nissen Lundbæk bills as a “Zoom”-style business model, in reference to the ubiquitous videoconferencing tool which has both free and paid users.

This means Xayn’s search is not ad-supported. That’s right; you get zero ads in search results.

Instead, the idea is for the consumer app to act as a showcase for a b2b product powered by the same core AI tech. The pitch to business/public sector customers is speedier corporate/internal search without compromising commercial data privacy.

Lundbæk argues businesses are sorely in need of better search tools to (safely) apply to their own data, saying studies have shown that search in general costs around 18% of working time globally. He also cites a study by one city authority that found staff spent 37% of their time at work searching for documents or other digital content.

“It’s a business model that Google has tried but failed to succeed,” he argues, adding: “We are solving not only a problem that normal people have but also that companies have… For them privacy is not a nice to have; it needs to be there otherwise there is no chance of using anything.”

On the consumer side there will also be some premium add-ons headed for the app — so the plan is for it to be a freemium download.

Swipe to nudge the algorithm

One key thing to note is Xayn’s newly launched web search app gives users a say in whether the content they’re seeing is useful to them (or not).

It does this via a Tinder-style swipe right (or left) mechanic that lets users nudge its personalization algorithm in the right direction — starting with a home screen populated with news content (localized by country) but also extending to the search result pages.

The news-focused homescreen is another notable feature. And it sounds like different types of homescreen feeds may be on the premium cards in future.

Another key feature of the app is the ability to toggle personalized search results on or off entirely — just tap the brain icon at the top right to switch the AI off (or back on). Results without the AI running can’t be swiped, except for bookmarking/sharing.

Elsewhere, the app includes a history page which lists searches from the past seven days (by default). The other options offered are: Today, 30 days, or all history (and a bin button to purge searches).

There’s also a ‘Collections’ feature that lets you create and access folders for bookmarks.

As you scroll through search results you can add an item to a Collection by swiping right and selecting the bookmark icon — which then opens a prompt to choose which one to add it to.

The swipe-y interface feels familiar and intuitive, if slightly laggy to load content in the TestFlight beta version TechCrunch checked out ahead of launch.

Swiping left on a piece of content opens a bright pink color-block stamped with a warning ‘x’. Keep going and you’ll send the item vanishing into the ether, presumably seeing fewer like it in future.

Whereas a swipe right affirms a piece of content is useful. This means it stays in the feed, outlined in Xayn green. (Swiping right also reveals the bookmark option and a share button.)

While there are pro-privacy/non-tracking search engines on the market already — such as US-based DuckDuckGo or France’s Qwant — Xayn argues the user experience of such rivals tends to fall short of what you get with a tracking search engine like Google, i.e. in terms of the relevance of search results and thus time spent searching.

Simply put: You probably have to spend more time ‘DDGing’ or ‘Qwanting’ to get the specific answers you need vs Googling — hence the ‘convenience cost’ associated with safeguarding your privacy when web searching.

Xayn’s contention is there’s a third, smarter way of getting to keep your ‘virtual clothes’ on when searching online. This involves implementing AI models that learn on-device and can be combined in a privacy-safe way so that results can be personalized without putting people’s data at risk.

“Privacy is the very fundament… It means that quite like other privacy solutions we track nothing. Nothing is sent to our servers; we don’t store anything of course; we don’t track anything at all. And of course we make sure that any connection that is there is basically secured and doesn’t allow for any tracking at all,” says Lundbæk, explaining the team’s AI-fuelled, decentralized/edge-computing approach.

On-device reranking

Xayn is drawing on a number of search index sources, including (but not solely) Microsoft’s Bing, per Lundbæk, who described this bit of what it’s doing as “relatively similar” to DuckDuckGo (which has its own web crawling bots).

The big difference is that it’s also applying its own reranking algorithms in order generate privacy-safe personalized search results (whereas DDG uses a contextual ads-based business model — looking at simple signals like location and keyword search to target ads without needing to profile users).

The downside to this sort of approach, according to Lundbæk, is users can get flooded with ads — as a consequence of the simpler targeting meaning the business serves more ads to try to increase chances of a click. And loads of ads in search results obviously doesn’t make for a great search experience.

“We get a lot of results on device level and we do some ad hoc indexing — so we build on the device level and on index — and with this ad hoc index we apply our search algorithms in order to filter them, and only present you what is more relevant and filter out everything else,” says Lundbæk, sketching how Xayn works. “Or basically downgrade it a bit… but we also try to keep it fresh and explore and also bump up things where they might not be super relevant for you but it gives you some guarantees that you won’t end up in some kind of bubble.”

Some of what Xayn’s doing is in the arena of federated learning (FL) — a technology Google has been dabbling in in recent years, including pushing a ‘privacy-safe’ proposal for replacing third party tracking cookies. But Xayn argues the tech giant’s interests, as a data business, simply aren’t aligned with cutting off its own access to the user data pipe (even if it were to switch to applying FL to search).

Whereas its interests — as a small, pro-privacy German startup — are markedly different. Ergo, the privacy-preserving technology it’s spent years building has a credible interest in safeguarding people’s data, is the claim.

Google wants to phase out support for third-party cookies in Chrome within two years

“At Google there’s actually [fewer] people working on federate learning than in our team,” notes Lundbæk, adding: “We’ve been criticizing TFF [Google-designed TensorFlow Federated] at lot. It is federated learning but it’s not actually doing any encryption at all — and Google has a lot of backdoors in there.

“You have to understand what does Google actually want to do with that? Google wants to replace [tracking] cookies — but especially they want to replace this kind of bumpy thing of asking for user consent. But of course they still want your data. They don’t want to give you any more privacy here; they want to actually — at the end — get your data even easier. And with purely federated learning you actually don’t have a privacy solution.

“You have to do a lot in order to make it privacy preserving. And pure TFF is certainly not that privacy-preserving. So therefore they will use this kind of tech for all the things that are basically in the way of user experience — which is, for example, cookies but I would be extremely surprised if they used it for search directly. And even if they would do that there is a lot of backdoors in their system so it’s pretty easy to actually acquire the data using TFF. So I would say it’s just a nice workaround for them.”

“Data is basically the fundamental business model of Google,” he adds. “So I’m sure that whatever they do is of course a nice step in the right direction… but I think Google is playing a clever role here of kind of moving a bit but not too much.”

So how, then, does Xayn’s reranking algorithm work?

The app runs four AI models per device, combining encrypted AI models of respective devices asynchronously — with homomorphic encryption — into a collective model. A second step entails this collective model being fed back to individual devices to personalize served content, it says. 

The four AI models running on the device are one for natural language processing; one for grouping interests; one for analyzing domain preferences; and one for computing context.

“The knowledge is kept but the data is basically always staying on your device level,” is how Lundbæk puts it.

“We can simply train a lot of different AI models on your phone and decide whether we, for example, combine some of this knowledge or whether it also stays on your device.”

“We have developed a quite complex solution of four different AI models that work in composition with each other,” he goes on, noting that they work to build up “centers of interest and centers of dislikes” per user — again, based on those swipes — which he says “have to be extremely efficient — they have to be moving, basically, also over time and with your interests”.

The more the user interacts with Xayn, the more precise its personalization engine gets as a result of on-device learning — plus the added layer of users being able to get actively involved by swiping to give like/dislike feedback.

The level of personalization is very individually focused — Lundbæk calls it “hyper personalization” — more so than a tracking search engine like Google, which he notes also compares cross-user patterns to determine which results to serve — something he says Xayn absolutely does not do.

Small data, not big data

“We have to focus entirely on one user so we have a ‘small data’ problem, rather than a big data problem,” says Lundbæk. “So we have to learn extremely fast — only from eight to 20 interactions we have to already understand a lot from you. And the crucial thing is of course if you do such a rapid learning then you have to take even more care about filter bubbles — or what is called filter bubbles. We have to prevent the engine going into some kind of biased direction.”

To avoid this echo chamber/filter bubble type effect, the Xayn team has designed the engine to function in two distinct phases which it switches between: Called ‘exploration’ and (more unfortunately) ‘exploitation’ (i.e. just in the sense that it already knows something about the user so can be pretty certain what it serves will be relevant).

“We have to keep fresh and we have to keep exploring things,” he notes — saying that’s why it developed one of the four AIs (a dynamic contextual multi-armed bandit reinforcement learning algorithm for computing context).

Aside from this app infrastructure being designed natively to protect user privacy, Xayn argues there are a bunch of other advantages — such as being able to derive potentially very clear interests signs from individuals; and avoiding the chilling effect that can result from tracking services creeping users out (to the point people they avoid making certain searches in order to prevent them from influencing future results).

“You as the user can decide whether you want the algorithm to learn — whether you want it to show more of this or less of this — by just simply swiping. So it’s extremely easy, so you can train your system very easily,” he argues.

There is potentially a slight downside to this approach, too, though — assuming the algorithm (when on) does some learning by default (i.e in the absence of any life/dislike signals from the user).

This is because it puts the burden on the user to interact (by swiping their feedback) in order to get the best search results out of Xayn. So that’s an active requirement on users, rather than the typical passive background data mining and profiling web users are used to from tech giants like Google (which is, however, horrible for their privacy).

It means there’s an ‘ongoing’ interaction cost to using the app — or at least getting the most relevant results out of it. You might not, for instance, be advised to let a bunch of organic results just scroll past if they’re really not useful but rather actively signal disinterest on each.

For the app to be the most useful it may ultimately pay to carefully weight each item and provide the AI with a utility verdict. (And in a competitive battle for online convenience every little bit of digital friction isn’t going to help.)

Asked about this specifically, Lundbæk told us: “Without swiping the AI only learns from very weak likes but not from dislikes. So the learning takes place (if you turn the AI on) but it’s very slight and does not have a big effect. These conditions are quite dynamic, so from the experience of liking something after having visited a website, patterns are learned. Also, only 1 of the 4 AI models (the domain learning one) learns from pure clicks; the others don’t.”

Xayn does seem alive to the risk of the swiping mechanic resulting in the app feeling arduous. Lundbæk says the team is looking to add “some kind of gamification aspect” in the future — to flip the mechanism from pure friction to “something fun to do”. Though it remains to be seen what they come up with on that front.

There is also inevitably a bit of lag involved in using Xayn vs Google — by merit of the former having to run on-device AI training (whereas Google merely hoovers your data into its cloud where it’s able to process it at super-speeds using dedicated compute hardware, including bespoke chipsets).

“We have been working for over a year on this and the core focus point was bringing it on the street, showing that it works — and of course it is slower than Google,” Lundbæk concedes.

“Google doesn’t need to do any of these [on-device] processes and Google has developed even its own hardware; they developed TPUs exactly for processing this kind of model,” he goes on. “If you compare this kind of hardware it’s pretty impressive that we were even able to bring [Xayn’s on-device AI processing] even on the phone. However of course it’s slower than Google.”

Lundbæk says the team is working on increasing the speed of Xayn. And anticipates further gains as it focuses more on that type of optimization — trailing a version that’s 40x faster than the current iteration.

“It won’t at the end be 40x faster because we will use this also to analyze even more content — to give you can even broader view — but it will be faster over time,” he adds.

On the accuracy of search results vs Google, he argues the latter’s ‘network effect’ competitive advantage — whereby its search reranking benefits from Google having more users — is not unassailable because of what edge AI can achieve working smartly atop ‘small data’.

Though, again, for now Google remains the search standard to beat.

“Right now we compare ourselves, mostly against Bing and DuckDuckGo and so on. Obviously there we get much better results [than compared to Google] but of course Google is the market leader and is using quite some heavy personalization,” he says, when we ask about benchmarking results vs other search engines.

“But the interesting thing is so far Google is not only using personalization but they also use kind of a network effect. PageRank is very much a network effect where the most users they have the better the results get, because they track how often people click on something and bump this also up.

“The interesting effect there is that right now, through AI technology — like for example what we use — the network effect becomes less and less important. So actually I would say that there isn’t really any network effect anymore if you really want to compete with pure AI technology. So therefore we can get almost as relevant results as Google right now and we surely can also, over time, get even better results or competing results. But we are different.”

In our (brief) tests of the beta app Xayn’s search results didn’t obviously disappoint for simple searches (and would presumably improve with use). Though, again, the slight load lag adds a modicum of friction which was instantly obvious compared to the usual search competition.

Not a deal breaker — just a reminder that performance expectations in search are no cake walk (even if you can promise a cookie-free experience).

Europe’s top court says active consent is needed for tracking cookies

An opportunity for competition?

“So far Google has so far had the advantage of a network effect — but this network effect gets less and less dominant and you see already more and more alternatives to Google popping up,” Lundbæk argues, suggesting privacy concerns are creating an opportunity for increased competition in the search space.

“It’s not anymore like Facebook or so where there’s one network where everyone has to be. And I think this is actually a nice situation because competition is always good for technical innovations and for also satisfying different customer needs.”

Of course the biggest challenge for any would-be competitor to Google search — which carves itself a marketshare in Europe in excess of 90% — is how to poach (some of) its users.

Lundbæk says the startup has no plans to splash millions on marketing at this point. Indeed, he says they want to grow usage sustainably, with the aim of evolving the product “step by step” with a “tight community” of early adopters — relying on cross-promotion from others in the pro-privacy tech space, as well as reaching out to relevant influencers.

He also reckons there’s enough mainstream media interest in the privacy topic to generate some uplift.

“I think we have such a relevant topic — especially now,” he says. “Because we want to show also not only for ourselves that you can do this for search but we think we show a real nice example that you can do this for any kind of case.

“You don’t always need the so-called ‘best’ big players from the US which are of course getting all of your data, building up profiles. And then you have these small, cute privacy-preserving solutions which don’t use any of this but then offer a bad user experience. So we want to show that this shouldn’t be the status quo anymore — and you should start to build alternatives that are really build on European values.”

And it’s certainly true EU lawmakers are big on tech sovereignty talk these days, even though European consumers mostly continue to embrace big (US) tech.

Perhaps more pertinently, regional data protection requirements are making it increasing challenging to rely on US-based services for processing data. Compliance with the GDPR data protection framework is another factor businesses need to consider. All of which is driving attention onto ‘privacy-preserving’ technologies.

Europe puts out advice on fixing international data transfers that’s cold comfort for Facebook

 

Xayn’s team is hoping to be able spread its privacy-preserving gospel to general users by growing the b2b side of the business, according to Lundbæk — so it’s hoping some home use will follow once employees get used to convenient private search via their workplaces, in a small-scale reverse of the business consumerization trend that was powered by modern smartphones (and people bringing their own device to work).

“We these kind of strategies I think we can step by step build up in our communities and spread the word — so we think we don’t even need to really spend millions of euros in marketing campaigns to get more and more users,” he adds.

While Xayn’s initial go-to-market push has been focused on getting the mobile apps out, a desktop version is also planned for Q1 next year.

The challenge there is getting the app to work as a browser extension as the team obviously doesn’t want to build its own browser to house Xayn. tl;dr: Competing with Google search is mountain enough to climb, without trying to go after Chrome (and Firefox, and so on).

“We developed our entire AI in Rust which is a safe language. We are very much driven by security here and safety. The nice thing is it can work everywhere — from embedded systems towards mobile systems, and we can compile into web assembly so it runs also as a browser extension in any kind of browser,” he adds. “Except for Internet Explorer of course.”

More TechCrunch

The person who claimed to have stolen the physical addresses of 49 million Dell customers appears to have taken more data from a different Dell portal, TechCrunch has learned. The…

Threat actor scraped Dell support tickets, including customer phone numbers

If you write the words “cis” or “cisgender” on X, you might be served this full-screen message: “This post contains language that may be considered a slur by X and…

On Elon’s whim, X now treats ‘cisgender’ as a slur

Facebook once had big ambitions to be a major player in enterprise communication and productivity, but today the social network’s parent company Meta will be closing a very significant chapter…

Meta is shutting down Workplace, its enterprise communications business

The Oversight Board has overturned Meta’s decision to take down a documentary revealing the identities of child abuse victims in Pakistan.

Meta’s Oversight Board overturns takedown decision for Pakistan child abuse documentary

The keynote kicks off at 10 a.m. PT on Tuesday and will offer glimpses into the latest versions of Android, Wear OS and Android TV.

Google I/O 2024: How to watch

Adam Selipsky is stepping down from his role as CEO of Amazon Web Services, Amazon has confirmed to TechCrunch.  In a memo shared internally by Amazon CEO Andy Jassy and…

AWS CEO Adam Selipsky steps down

VC and podcaster David Sacks has revealed a new AI chat app called Glue that fixes “Slack channel fatigue,” he says.

David Sacks reveals Glue, the AI company he’s been teasing on his All In podcast

Harness isn’t founder Jyoti Bansal’s first startup. He sold AppDynamics to Cisco for $3.7 billion in 2017, the week it was supposed to go public. His latest venture has raised…

After surpassing $100M in ARR, Harness grabs a $150M line of credit

You can expect plenty of AI, but probably not a lot of hardware.

Google I/O 2024: What to expect

The company’s autonomous vehicles have had a number of misadventures lately, involving driving into construction sites.

Waymo’s robotaxis under investigation after crashes and traffic mishaps

The company is describing the event as “a chance to demo some ChatGPT and GPT-4 updates.”

OpenAI’s ChatGPT announcement: Watch the GPT-4o reveal and demo here

Sona, a workforce management platform for frontline employees, has raised $27.5 million in a Series A round of funding. More than two-thirds of the U.S. workforce are reportedly in frontline…

Sona, a frontline workforce management platform, raises $27.5M with eyes on US expansion

Uber Technologies announced Tuesday that it will buy the Taiwan unit of Delivery Hero’s Foodpanda for $950 million in cash. The deal is part of Uber Eats’ strategy to expand…

Uber to acquire Foodpanda’s Taiwan unit from Delivery Hero for $950M in cash 

Paris-based Blisce has become the latest VC firm to launch a fund dedicated to climate tech. It plans to raise as much as €150M (about $162M).

Paris-based VC firm Blisce launches climate tech fund with a target of $160M

Maad, a B2B e-commerce startup based in Senegal, has secured $3.2 million debt-equity funding to bolster its growth in the western Africa country and to explore fresh opportunities in the…

Maad raises $3.2M seed amid B2B e-commerce sector turbulence in Africa

The fresh funds were raised from two investors who transferred the capital into a special purpose vehicle, a legal entity associated with the OpenAI Startup Fund.

OpenAI Startup Fund raises additional $5M

Accel has invested in more than 200 startups in the region to date, making it one of the more prolific VCs in this market.

Accel has a fresh $650M to back European early-stage startups

Kyle Vogt, the former founder and CEO of self-driving car company Cruise, has a new VC-backed robotics startup focused on household chores. Vogt announced Monday that the new startup, called…

Cruise founder Kyle Vogt is back with a robot startup

When Keith Rabois announced he was leaving Founders Fund to return to Khosla Ventures in January, it came as a shock to many in the venture capital ecosystem — and…

From Miles Grimshaw to Eva Ho, venture capitalists continue to play musical chairs

On the heels of OpenAI announcing the latest iteration of its GPT large language model, its biggest rival in generative AI in the U.S. announced an expansion of its own.…

Anthropic is expanding to Europe and raising more money

If you’re looking for a Starliner mission recap, you’ll have to wait a little longer, because the mission has officially been delayed.

TechCrunch Space: You rock(et) my world, moms

Apple devoted a full event to iPad last Tuesday, roughly a month out from WWDC. From the invite artwork to the polarizing ad spot, Apple was clear — the event…

Apple iPad Pro M4 vs. iPad Air M2: Reviewing which is right for most

Terri Burns, a former partner at GV, is venturing into a new chapter of her career by launching her own venture firm called Type Capital. 

GV’s youngest partner has launched her own firm

The decision to go monochrome was probably a smart one, considering the candy-colored alternatives that seem to want to dazzle and comfort you.

ChatGPT’s new face is a black hole

Apple and Google announced on Monday that iPhone and Android users will start seeing alerts when it’s possible that an unknown Bluetooth device is being used to track them. The…

Apple and Google agree on standard to alert people when unknown Bluetooth devices may be tracking them

A human safety operator will be behind the wheel during this phase of testing, according to the company.

GM’s Cruise ramps up robotaxi testing in Phoenix

OpenAI announced a new flagship generative AI model on Monday that they call GPT-4o — the “o” stands for “omni,” referring to the model’s ability to handle text, speech, and…

OpenAI debuts GPT-4o ‘omni’ model now powering ChatGPT

Featured Article

The women in AI making a difference

As a part of a multi-part series, TechCrunch is highlighting women innovators — from academics to policymakers —in the field of AI.

24 hours ago
The women in AI making a difference

The expansion of Polar Semiconductor’s facility would enable the company to double its U.S. production capacity of sensor and power chips within two years.

White House proposes up to $120M to help fund Polar Semiconductor’s chip facility expansion

In 2021, Google kicked off work on Project Starline, a corporate-focused teleconferencing platform that uses 3D imaging, cameras and a custom-designed screen to let people converse with someone as if…

Google’s 3D video conferencing platform, Project Starline, is coming in 2025 with help from HP