Taxi drivers in Yokohama spend much of their shift searching for passengers, a task that depends heavily on local knowledge and experience. This study evaluates whether an AI navigation tool that predicts high-demand routes can reduce cruising time and examines how the technology’s impact varies by driver skill level. Using data from a field trial with 520 drivers across 25,039 cruises, researchers estimated time-varying hazard models with tight controls for demand conditions and applied instrumental-variable methods to address endogenous AI activation. Results show the AI cut search time by around 5%, with benefits concentrated among lower-skilled drivers (7–8% reduction) and minimal gains for high-skilled operators, narrowing the productivity gap by 13.4%. These findings demonstrate that AI can substitute for human expertise in prediction tasks, with significant implications for workforce composition and training priorities in automatable occupations.
Case Study Source: INFORMS PubsOnline (publisher: INFORMS).
Problem Statement
Taxi drivers in Yokohama spend a large share of working time cruising for passengers. The study examined whether an AI tool that predicts high-demand routes can cut this search time and how its benefits differ by driver skill.
Goal
Quantify the causal effect of the AI navigation tool on drivers’ productivity (time to find a passenger) and determine whether AI complements or substitutes driver skill by comparing impacts across skill levels.
Challenges
Selective use of the AI in tougher conditions: average cruising time when AI was on was **16.20** minutes (median **12.20**) versus **11.43** minutes (median **7.75**) when off.
Endogeneity in the timing of AI activation, requiring tight controls for underlying demand at the ward and date–hour level.
Low and uneven adoption during a short trial: only **198** of **520** drivers used the tool at least once; AI was on in **12.6%** of **25,039** cruises.
Limited driver background data (no age, gender, tenure), constraining observable controls.
Need to model within-cruise, minute-level switching and to construct credible pre-period skill and demand indices.
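The adoption and selection figures listed above can be turned into rates with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted in this case study; the variable names are illustrative, and the count of AI-on cruises is approximate because the 12.6% share is itself rounded.

```python
# Figures quoted in the case study.
drivers_total = 520
drivers_used_ai = 198
cruises_total = 25_039
share_cruises_ai_on = 0.126          # rounded share reported in the source

mean_on, mean_off = 16.20, 11.43     # average cruising minutes, AI on vs. off
median_on, median_off = 12.20, 7.75  # median cruising minutes, AI on vs. off

adoption_rate = drivers_used_ai / drivers_total
cruises_ai_on = share_cruises_ai_on * cruises_total   # approximate count
mean_gap = mean_on / mean_off - 1                     # raw, uncontrolled gap
median_gap = median_on / median_off - 1

print(f"Drivers who used the tool at least once: {adoption_rate:.1%}")  # 38.1%
print(f"Cruises with AI on (approx.): {cruises_ai_on:,.0f}")            # 3,155
print(f"Raw mean gap, AI on vs. off: {mean_gap:.1%}")                   # 41.7%
print(f"Raw median gap, AI on vs. off: {median_gap:.1%}")               # 57.4%
```

The raw gaps point the "wrong" way (longer searches with AI on), which is why a naive on/off comparison cannot be read as the tool's effect.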
Actions
Estimated time-varying Weibull hazard models with driver, ward and date–hour fixed effects to compare on/off AI usage within the same driver and demand conditions.
Built pre-period indices: a driver skill index (from historical productivity) and a demand index (ward × date–hour) to study heterogeneity and control for local demand.
Applied an instrumental-variable, control-function approach to address endogeneity in AI activation. The instrument is drivers’ unfamiliarity with drop-off locations, measured from past ward × time-of-day visit frequencies, with a Tobit first stage.
Used propensity-score trimming (0.1–0.9) to ensure overlap and ran robustness checks, including a Cox proportional-hazards model and a placebo timing test.
Analysed adherence to AI-suggested routes and split results by the first vs. second two weeks to assess immediate effects and learning.
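The propensity-score trimming step in the list above can be sketched as a simple filter. This is an illustration under stated assumptions, not the study's code: the `Cruise` record, its field names, and the inclusive treatment of the 0.1 and 0.9 bounds are hypothetical, and the propensity scores are assumed to have been estimated beforehand.

```python
from dataclasses import dataclass


@dataclass
class Cruise:
    cruise_id: int
    ai_on: bool
    propensity: float  # estimated P(AI on | covariates); assumed precomputed


def trim_by_propensity(cruises, lower=0.1, upper=0.9):
    """Keep cruises whose estimated propensity lies in [lower, upper].

    Dropping observations with extreme scores keeps only cruises for which
    both AI-on and AI-off outcomes are plausible (the overlap region).
    """
    return [c for c in cruises if lower <= c.propensity <= upper]


# Hypothetical example records.
sample = [
    Cruise(1, False, 0.03),  # almost never AI-on: trimmed
    Cruise(2, True, 0.45),
    Cruise(3, False, 0.10),  # on the boundary: kept (inclusive bounds assumed)
    Cruise(4, True, 0.95),   # almost always AI-on: trimmed
]
kept = trim_by_propensity(sample)
print([c.cruise_id for c in kept])  # [2, 3]
```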
Key Results
Impact
Provides direct evidence that AI can substitute for human skill in prediction-heavy tasks, with gains accruing mainly to lower-skilled workers.
By narrowing productivity differences by 13.4%, the technology points to reduced relative demand for high-skilled drivers and potential compression of wage or employment gaps.
Informs workforce strategy: hire more novices for AI-supported tasks and focus training on non-automatable capabilities such as social and customer-facing skills.
The Challenge
Taxi drivers in Yokohama were spending much of their shift simply looking for passengers. Researchers wanted to find out whether a predictive AI system could help cut down empty cruising time. More importantly, they asked: does this technology level the playing field between experienced and novice drivers, or does it amplify existing skill gaps?
The trial faced several hurdles. Uptake was modest: only 198 of 520 drivers tried the tool, and it was active in only 12.6% of cruises. When drivers did switch it on, conditions were typically harder: median search time with AI running stood at 12.20 minutes, compared with 7.75 minutes when it was off. That selective usage made it difficult to isolate the tool’s true effect. The research team also lacked basic driver data such as age, gender or tenure, and had to account for minute-by-minute switching within individual cruises.
The Approach
To tease out causality, the researchers estimated time-varying Weibull hazard models with driver, ward and date–hour fixed effects. In effect, they compared the same driver’s search times with and without AI while holding the ward and the date–hour constant. A skill index was built from each driver’s pre-trial productivity record, and a demand index tracked local conditions by ward and date–hour.
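One way to write down this within-driver comparison is a proportional-hazards model with a Weibull baseline. The notation is an illustrative assumption rather than the paper's exact specification: $t$ is minutes since cruise $c$ began, $\mathrm{AI}_{ic}(t)$ equals 1 when driver $i$ has the tool on at minute $t$, and $\alpha_i$, $\gamma_{w(c)}$ and $\delta_{h(c)}$ stand for driver, ward and date–hour fixed effects.

```latex
h_{ic}(t) \;=\; \lambda\, p\, t^{\,p-1}
  \exp\!\bigl( \beta\, \mathrm{AI}_{ic}(t) + \alpha_i + \gamma_{w(c)} + \delta_{h(c)} \bigr)
```

Here the hazard is the instantaneous rate of finding a passenger, so $\beta > 0$ means the tool shortens search. If the covariates were held fixed over a whole cruise, search time would scale by $\exp(-\beta / p)$.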
Because drivers tended to activate AI in tricky situations, a standard comparison would be misleading. The team tackled this with an instrumental-variable method: drivers unfamiliar with a drop-off area were more likely to turn on the tool, while that unfamiliarity is assumed not to affect cruising success directly. Additional checks, including propensity-score trimming, a Cox proportional-hazards model and a placebo timing test, supported the robustness of the findings. The analysis also looked at whether drivers followed AI recommendations and tracked changes across the trial’s first and second fortnights.
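The logic of the control-function correction can be illustrated on simulated data. The sketch below is a deliberately simplified linear version, not the study's specification (which pairs a Tobit first stage with a hazard model): the data-generating process, the coefficient values, the continuous AI-use intensity, and all function names are assumptions made for illustration. An unobserved "difficulty" term raises both AI use and search time, so the naive regression is biased upward; adding the first-stage residual as a regressor removes that bias.

```python
import random


def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        pivot = max(range(i, k), key=lambda idx: abs(a[idx][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [x - factor * p for x, p in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=5000, true_effect=-0.05, seed=0):
    """Simulated cruises: z is the instrument, u is unobserved difficulty."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]   # unfamiliarity with drop-off area
    u = [rng.gauss(0, 1) for _ in range(n)]   # difficulty, unseen by the analyst
    ai = [0.8 * zi + 0.6 * ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    log_t = [true_effect * a + 0.5 * ui + rng.gauss(0, 0.5)
             for a, ui in zip(ai, u)]
    return z, ai, log_t


def control_function_estimate(z, ai, log_t):
    """Stage 1: regress AI use on the instrument. Stage 2: add the residual."""
    b0, b1 = ols([z], ai)
    resid = [a - b0 - b1 * zi for a, zi in zip(ai, z)]
    return ols([ai, resid], log_t)[1]


z, ai, log_t = simulate()
naive = ols([ai], log_t)[1]
corrected = control_function_estimate(z, ai, log_t)
print(f"naive: {naive:+.3f}  control function: {corrected:+.3f}  truth: -0.050")
```

In this linear setting the control-function estimate coincides with two-stage least squares; the residual term absorbs the part of AI use that is correlated with unobserved difficulty.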
What the Data Revealed
Cruising time fell modestly
Switching on the AI cut empty search time by 5.3% on average. That figure stayed consistent across different analytical methods, ranging from 4.8% to 6.5%.
Lower-skilled drivers gained most
The benefits were far from equal. Lower-skilled drivers saw their search time drop by roughly 7%, with the least skilled third enjoying an 8% improvement. Meanwhile, top performers saw virtually no change. In effect, the AI acted as a substitute for hard-won local knowledge.
Productivity spread narrowed
The performance gap between the best and worst drivers shrank by 13.4%. This compression suggests the tool is redistributing efficiency gains downwards rather than magnifying existing advantages.
Fare quality held steady
There was no sign that drivers were chasing quicker, cheaper rides when using AI. Average fares remained stable, ruling out one potential unintended consequence.
Benefits plateaued early
All the improvement came within the first two weeks; no further learning occurred thereafter. Compliance with AI suggestions hovered around 55% for lower-skilled drivers and 53.5% for their higher-skilled peers, which are remarkably similar rates.
Broader Implications
This study offers rare real-world evidence that AI can directly replace human expertise in tasks that rely heavily on prediction. The gains flowed almost entirely to those with less skill, narrowing inequality within the workforce.
That compression has workforce implications. Firms might recruit more novices for AI-augmented roles, knowing the technology compensates for lack of experience. At the same time, it underscores the importance of developing skills that machines can’t easily replicate—particularly interpersonal abilities and customer service. As predictive tools become ubiquitous, the premium on social and adaptive skills is likely to rise.
