Closed Beta · Invite-Only · Shaping With Users

pharmasearch.tools

Drug intelligence for people who need the answer — not a subscription to a bloated terminal.

// full-stack US pharma data, one REST API, one sign-up

Every FDA dataset, every international regulator we can legally redistribute, every clinical trial registry, every major open scientific database — normalized into one instance with a documented Probability-of-Approval engine and a Claude-powered AI analyst that runs real SQL on top. Every data point commercially-free and traceable to its source row.

25+ Data Sources · 1 Unified API · 0 Sales Cycles · 100% Open Data


NDA 212102 Fintepla APPROVED 2020-06-25 ◆ NDA 204275 Nuzyra APPROVED 2018-10-02 ◆ BLA 125557 Kanjinti APPROVED 2019-06-13 ◆ NDA 218347 Rezdiffra APPROVED 2024-03-14 ◆ NDA 217030 Wegovy EXPANDED 2024-03-08 ◆ NDA 215206 Leqembi APPROVED 2023-07-06 ◆ BLA 761300 Aqneursa APPROVED 2024-09-24

§ I · The problem

Enterprise pharma tools are built for enterprises.

The terminals are priced for investment banks and mega-pharma. The onboarding takes twelve months and a committee. Every feature is a separate SKU. Meanwhile, the people who actually need this data every day — clinical researchers, biotech analysts, medical-affairs teams, diligence consultants, IR — either get priced out or end up stuck with ten browser tabs and an FDA.gov bookmark.

pharmasearch.tools is the opposite bet. One self-serve sign-up. Full-stack US pharma intelligence. An API that covers every public data point we can legally redistribute, an engine that predicts approval probability with documented backtest accuracy, and an AI agent that does the analyst-intern work in seconds. Built entirely on commercially-free data — no paid subscriptions to scrape, no licensing landmines to inherit, every row attributable to its source.

§ II · What’s in the box

Four capabilities. One platform.

Each one alone would be a product. Together they’re the difference between “another database” and “the answer to the question you actually have.”

CAPABILITY 01

Search pharma data like one database — because it is.

Every FDA dataset, every international regulator we can legally redistribute, every clinical trial registry, every major open scientific database — normalized into one Postgres instance with a unified REST API in front. Write one query, get patent expiration dates. Write another, get every Phase 1/2/3/4 trial ever registered for a drug. A safety analyst traces one FAERS mention back through the label, the Drugs@FDA review, the CRL history, and the sponsor’s SEC filings — in one query.
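As a rough illustration of the one-database idea, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the platform's Postgres instance. The table names, column names, and rows are hypothetical placeholders, not the real schema or real data. The only point is that once sources share a key, patents and trials come back from a single join with their source row IDs attached.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with two hypothetical, pre-normalized tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patents (appl_no TEXT, patent_no TEXT, expires TEXT, source_row_id TEXT);
        CREATE TABLE trials  (nct_id TEXT, appl_no TEXT, phase TEXT, source_row_id TEXT);
        -- Placeholder rows only; none of these identifiers are real.
        INSERT INTO patents VALUES ('000001', 'P-1', '2031-05-01', 'orange_book:1');
        INSERT INTO patents VALUES ('000001', 'P-2', '2034-11-15', 'orange_book:2');
        INSERT INTO trials  VALUES ('NCT-A', '000001', 'PHASE3', 'ctgov:10');
        INSERT INTO trials  VALUES ('NCT-B', '000001', 'PHASE2', 'ctgov:11');
        INSERT INTO trials  VALUES ('NCT-C', '000002', 'PHASE1', 'ctgov:12');
        """
    )
    return conn


def patents_and_trials(conn: sqlite3.Connection, appl_no: str) -> list[tuple]:
    """Join patents to trials on the shared application number, keeping both row IDs."""
    sql = """
        SELECT p.patent_no, p.expires, t.nct_id, t.phase,
               p.source_row_id, t.source_row_id
        FROM patents AS p
        JOIN trials AS t ON t.appl_no = p.appl_no
        WHERE p.appl_no = ?
        ORDER BY p.expires, t.nct_id
    """
    return conn.execute(sql, (appl_no,)).fetchall()


if __name__ == "__main__":
    for row in patents_and_trials(build_demo_db(), "000001"):
        print(row)
```

Every returned row carries one source row ID per contributing table, which is the property the traceability claim relies on.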

CAPABILITY 02

The Probability of Approval engine.

The flagship. Give it a drug name, indication, and phase. It returns a probability that the drug will win FDA approval, backed by a documented, testable methodology. In backtests on historical cases, accuracy, calibration, and ranking performance are comparable to or better than hand-curated expert consensus, while every score stays fully reproducible and fully traceable to source rows. Every score emits a provenance record. Click through to the row in one tap.
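The page does not say which metrics sit behind "accuracy, calibration, and ranking performance". As a hedged sketch, three common choices are thresholded accuracy, the Brier score (a standard summary that reflects calibration), and ROC AUC (ranking). The function names and the 0.5 threshold below are assumptions for illustration, not the engine's documented backtest.

```python
def accuracy(probs: list[float], outcomes: list[int], threshold: float = 0.5) -> float:
    """Share of cases where the thresholded forecast matches the 0/1 outcome."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, outcomes))
    return hits / len(probs)


def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared error between forecast probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def auc(probs: list[float], outcomes: list[int]) -> float:
    """Probability that a random approved case outranks a random failed case."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative outcome")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Synthetic forecasts and outcomes, for illustration only.
    forecasts = [0.9, 0.8, 0.3, 0.2]
    approved = [1, 0, 1, 0]
    print(accuracy(forecasts, approved), brier_score(forecasts, approved), auc(forecasts, approved))
```

Reporting all three together matters because a model can rank well while being poorly calibrated, or the reverse.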

CAPABILITY 03

REMS data correlation that actually works.

FDA’s REMS (Risk Evaluation and Mitigation Strategies) data is scattered across PDFs, change-log CSVs, and separate ETASU requirements documents. We ingested the whole thing and structured it — live program registry, Elements To Assure Safe Use (patient agreements, provider certifications, pharmacy registrations, lab monitoring), modification history, enforcement actions, patient-burden estimates, and ML-generated REMS predictions for pipeline drugs. Encoded into PoA scoring on both sides of the ledger.

CAPABILITY 04

The AI agent — not a chatbot, a working analyst.

The AI endpoints expose a Claude-powered analyst with full read access to the database. Not a RAG system over documents — an agent that runs real SQL, joins tables across sources, reads FDA Review PDFs, summarizes FAERS adverse-event patterns, compares trial protocols, and cites its sources with row IDs. Every call goes against fresh DB state. When it says “the FDA rejected this class in 2019 per CRL,” there’s a row ID.

§ III · The corpus

Twenty-five authoritative sources. One schema.

Every source normalized into the same instance, queryable through one API, traceable back to its primary URL with a row ID. No paid subscriptions anywhere in this list.

FDA Drugs@FDA: Applications, Summary Reviews, approval letters, labels
FDA Orange Book: Approved products, patents, exclusivity data
FDA Complete Response Letters: Full openFDA CRL transparency dataset
FDA Advisory Committee: Meeting records, drug linkage, briefing docs
FDA PMRs / PMCs: Post-market requirements and commitments
FDA REMS@FDA: Active programs, ETASU, modification history
FDA drug recalls: Class I/II/III, reason categorization
FDA NDC Directory: Every marketed product with NDC code
NIH DailyMed: Full SPL label archive, revision history
ClinicalTrials.gov: Protocols, structured results, AE counts
CMS Open Payments: Physician-industry payments (KOL signals)
CMS Medicare Part B / D: Multi-year market-size data
USPTO patents: Metadata keyed to Orange Book
EMA EPAR: European regulatory review text
Health Canada: Notice of Compliance, recall history
IQWiG (German HTA): Added-benefit assessments
Orphanet: Rare-disease designations, drug-disease graph
PubMed: Clinical literature with drug/condition links
OpenTargets: Full drug × target × disease graph
ChEMBL: Structures, bioactivities, ATC, indications
PharmGKB: Clinical pharmacogenomic annotations
IUPHAR / BPS Pharmacology: Target selectivity data
Wikidata pharmacology: Cross-reference graph
RxNorm: Brand-to-generic normalization
DGIdb: Drug-gene interactions

§ IV · Flagship implementation

Probability of Approval — three phases, one score.

Every PoA score is the output of three documented scoring rules run in sequence, blended against historical class-failure multipliers, and emitted with a full provenance record. Here’s what runs under the hood — and what a live forecast looks like.

PHASE 01

Precedent Depth

Walks the OpenTargets knowledge graph (drug → target → indication → approved-analog cohort) and classifies evidence into five tiers. Each tier carries a tier-implied probability.

Level A: same drug, same indication, previously approved
Level B: same target, same indication
Level C: same mechanism, same indication
Level D: same therapeutic area
Level E: novel
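The five levels read naturally as a first-match cascade. The sketch below is one possible reading, assuming each drug is described by five plain string fields. The `Candidate` type and the matching rules are hypothetical, and the tier-implied probabilities are left out because the page does not publish them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical flat description of a drug-indication pair."""
    drug: str
    target: str
    mechanism: str
    therapeutic_area: str
    indication: str


def precedent_level(candidate: Candidate, approved: list[Candidate]) -> str:
    """Return the strongest precedent level ('A' to 'E') found in the approved cohort."""
    same_indication = [a for a in approved if a.indication == candidate.indication]
    if any(a.drug == candidate.drug for a in same_indication):
        return "A"  # same drug, same indication, previously approved
    if any(a.target == candidate.target for a in same_indication):
        return "B"  # same target, same indication
    if any(a.mechanism == candidate.mechanism for a in same_indication):
        return "C"  # same mechanism, same indication
    if any(a.therapeutic_area == candidate.therapeutic_area for a in approved):
        return "D"  # same therapeutic area
    return "E"      # novel: no usable precedent


if __name__ == "__main__":
    cohort = [Candidate("drug-1", "T1", "M1", "TA1", "IND1")]
    print(precedent_level(Candidate("drug-9", "T1", "M9", "TA9", "IND1"), cohort))
```

Checking the levels in order from A to E means a candidate always gets the strongest precedent that applies.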

PHASE 02

Cohort Outcome Rate

Computes historical success rates for the matched cohort from our ClinicalTrials.gov terminal-trial universe, blended with published Phase Transition Success Rate base rates using a documented cascade discount.

Rewards drugs whose analog cohort has a track record. Punishes drugs whose peers die in Phase 3.
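The cascade discount itself is not specified here, so the sketch below swaps in a simpler, well-known technique: pseudo-count shrinkage of the cohort's observed success rate toward a published base rate. The function name, the `prior_weight` parameter, and its default are assumptions for illustration only.

```python
def blended_success_rate(
    successes: int,
    trials: int,
    base_rate: float,
    prior_weight: float = 10.0,
) -> float:
    """Shrink an observed cohort success rate toward a base rate.

    The base rate acts like `prior_weight` extra trials, so small cohorts
    stay close to the base rate and large cohorts are dominated by their own record.
    """
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    if not 0.0 <= base_rate <= 1.0:
        raise ValueError("base_rate must be a probability")
    if prior_weight <= 0:
        raise ValueError("prior_weight must be positive")
    return (successes + prior_weight * base_rate) / (trials + prior_weight)


if __name__ == "__main__":
    # Synthetic counts: the same 80% observed rate, at two cohort sizes.
    print(blended_success_rate(8, 10, base_rate=0.5))
    print(blended_success_rate(80, 100, base_rate=0.5))
```

With these synthetic inputs the larger cohort moves the blend further from the base rate, which is the "track record" effect described above.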

PHASE 03

Benefit-Risk Framework

Scores four dimensions — severity of condition, unmet medical need, benefit, risk — across every relevant source: FAERS adverse events, boxed-warning history, EMA EPAR benefit-risk sections, FDA Review docs, Health Canada recalls, IQWiG HTA ratings.

Real sources. Real weights. No vibes.

NEGATIVE ADJUST

Class-failure multipliers

Negative-adjustment multipliers fire when historical evidence warrants: known class failures (Alzheimer’s amyloid graveyard, CETP inhibitors, ion-channel antiarrhythmics), CRLs for the same mechanism/indication, the subject drug’s own Phase 3 failure history, post-market boxed-warning additions in the analog cohort.
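One simple way to picture negative-adjustment multipliers is as factors in (0, 1] applied to the blended probability. This is a sketch under that assumption. The multiplier names and values are placeholders, and the engine may combine adjustments differently (for example in log-odds space).

```python
from math import prod


def apply_negative_adjustments(probability: float, multipliers: dict[str, float]) -> float:
    """Scale a probability down by every multiplier that fired.

    Each multiplier must lie in (0, 1], so an adjustment can only lower the
    score or leave it unchanged. An empty dict returns the input unchanged.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    for name, value in multipliers.items():
        if not 0.0 < value <= 1.0:
            raise ValueError(f"multiplier {name!r} must be in (0, 1]")
    return probability * prod(multipliers.values())


if __name__ == "__main__":
    # Placeholder multipliers, not published values.
    fired = {"known_class_failure": 0.5, "same_mechanism_crl": 0.8}
    print(apply_negative_adjustments(0.6, fired))
```

Keeping the fired multipliers in a named mapping also gives the provenance record something concrete to list.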

Live forecast · sample prov-id 04A82: Bayesian forecast 67%, rated HIGH, with an 80% credible interval of 52% to 81%. Factors shown: Analog Precedent, Target Validation, Safety Benchmark, Regulatory Pathway, Endpoint History.

Every score is traceable

The engine emits a structured provenance record showing exactly which rule fired, which sources contributed, and which drugs in the analog cohort drove the probability. You can click through to the source row in one tap. No black box. No “trust us.”
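To make "structured provenance record" concrete, here is a minimal sketch of what such a record could look like as a serializable data structure. Every field name is hypothetical and every value in the usage example is a placeholder. The real record format is whatever the API documentation defines.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SourceRef:
    """Pointer to one contributing row in one source (hypothetical shape)."""
    source: str
    row_id: str


@dataclass
class ProvenanceRecord:
    """Which rules fired, which rows contributed, and which analogs drove the score."""
    prov_id: str
    score: float
    rules_fired: list[str]
    sources: list[SourceRef] = field(default_factory=list)
    analog_cohort: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses, so SourceRef becomes a plain dict.
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    record = ProvenanceRecord(
        prov_id="example-1",
        score=0.5,
        rules_fired=["precedent_depth"],
        sources=[SourceRef(source="example_source", row_id="row-1")],
        analog_cohort=["analog-drug-1"],
    )
    print(record.to_json())
```

A flat, JSON-friendly shape like this is what makes click-through possible: each `row_id` can be resolved directly against the source table.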

Backtested on historical cases, accuracy, calibration, and ranking performance are comparable to or better than hand-curated expert consensus — while being fully reproducible and fully traceable to source rows.

§ V · The data integrity promise

Commercially-free. Fully attributable. Published.

We don’t ingest DrugBank’s commercial subscription content. We don’t scrape Citeline, BiomedTracker, DealForma, Evaluate Pharma, or any paid database. Every source is listed with its current license, commercial status, share-alike flags, and attribution requirements — published at /docs/DATA_LICENSING.md.

US Gov public domain: openFDA, FDA Drugs@FDA, Orange Book, REMS@FDA, ClinicalTrials.gov, NIH DailyMed, NDC Directory, CMS Open Payments, CMS Medicare, PubMed, USPTO PatentsView, RxNorm
CC0 / public-domain dedicated: OpenTargets, Wikidata, PubChem
CC-BY (attribution): Orphanet, EMA EPAR, Health Canada
CC-BY-SA (share-alike): ChEMBL, PharmGKB, IUPHAR / BPS Guide to Pharmacology
MIT / permissive: DGIdb

§ VI · Under the hood

Fast when the internet is on fire.

Every endpoint has a local answer. Live upstream calls only happen when someone explicitly asks for fresh data. The architecture is built to stay available when half the public data providers are not.

A fully-documented REST API with hundreds of endpoints, every one discoverable at /docs
Managed PostgreSQL backend with autoscaling compute
Cloudflare R2 for binary archives (full DailyMed SPL archive, FDA Review PDFs, REMS materials)
Standards-based auth with API keys and bearer tokens
Daily-quota middleware for sensible limits without rigid artificial restrictions
Latest Claude models for every agent operation — no cheap fallback models, no stale prompts
DB-first caching with documented TTLs for every external source — click “Refresh data” and the engine re-queries upstream and writes the fresh result back
Next.js, TypeScript, strict type-checking, polished UI, dark-mode first
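The DB-first caching item in the list above can be sketched as a small read-through store: serve the locally stored row, flag it as stale once its TTL has passed, and go upstream only on an explicit refresh. The class name, method signature, and the cold-miss behavior (fetch once if nothing is stored) are assumptions, and a plain dict stands in for the database.

```python
import time
from typing import Any, Callable


class DbFirstCache:
    """Serve stored rows first; call upstream only on explicit refresh or a cold miss."""

    def __init__(
        self,
        fetch_upstream: Callable[[str], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch_upstream
        self._ttl = ttl_seconds
        self._clock = clock
        self._rows: dict[str, tuple[Any, float]] = {}  # key -> (value, stored_at)

    def get(self, key: str, refresh: bool = False) -> tuple[Any, bool]:
        """Return (value, is_stale). Stale rows are still served, never refetched silently."""
        if refresh or key not in self._rows:
            # Write the fresh upstream result back before answering.
            self._rows[key] = (self._fetch(key), self._clock())
        value, stored_at = self._rows[key]
        is_stale = (self._clock() - stored_at) > self._ttl
        return value, is_stale


if __name__ == "__main__":
    cache = DbFirstCache(lambda key: f"upstream:{key}", ttl_seconds=3600)
    print(cache.get("example-key"))
    print(cache.get("example-key", refresh=True))
```

Returning the staleness flag rather than refetching automatically keeps reads fast when an upstream provider is down, and leaves the refresh decision with the caller.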

§ VII · Versus the incumbents

The comparison is not subtle.

We’re not trying to displace an enterprise RFP cycle. We’re giving solo consultants, small biotechs, research labs, and teams that can’t justify a six-figure seat the same answers — with receipts.

Deployment
pharmasearch.tools: Self-serve signup, free trial
Incumbents: Sales cycle, procurement, long onboarding

API access
pharmasearch.tools: Every endpoint in one plan
Incumbents: Often an additional enterprise tier

PoA prediction
pharmasearch.tools: Included, traceable, documented
Incumbents: Not standard, or gated behind enterprise pricing

Provenance / citations
pharmasearch.tools: Every score → source row
Incumbents: Varies; often opaque

Data licensing
pharmasearch.tools: Full audit at /docs/DATA_LICENSING.md
Incumbents: Opaque; baked into the total-cost figure

AI analyst
pharmasearch.tools: Claude-powered, full DB access
Incumbents: Usually none

Update cadence
pharmasearch.tools: Continuous ingest from FDA / EMA / CT.gov
Incumbents: Quarterly publishing cadence

What’s in the box
pharmasearch.tools: Full schema is visible
Incumbents: Often not

§ VIII · What’s shipping

Beta means live. And actively sharpening.

The platform is already powering real queries. These are the next items landing — users see each improvement as it goes live.

Deeper FDA Review coverage

Scraping more Drugs@FDA Summary Review PDFs with on-demand Benefit-Risk Framework parsing.

FDA AdCom transcript ingestion

To extract actual vote tallies — the FDA’s strongest public predictor of approval.

Full EMA EPAR coverage

Rate-limited by EMA’s servers but climbing steadily.

Publications-landscape scoring

Wiring PubMed research velocity into PoA as a positive signal for active investigation.

More therapeutic-area priors

Expanding beyond the modeled class graveyards (Alzheimer’s amyloid, CETP, ion-channel antiarrhythmics) to cover more historical class failures.

DailyMed historical archive

The current archive is ingested; the retired pre-2015 setids are the next target.

Real-time FDA update webhooks

Polling today; push-based delivery is on the roadmap.

§ IX · Who this is for

People who need the answer today.

Clinical researchers

Compare your trial design against every historical analog in seconds. Pull every terminal trial in your indication in one query.

Biotech diligence analysts

Validate a deal thesis with real FDA, CMS, EMA, and clinical data. Trace every asset back to its primary row.

Medical-affairs teams

Prepare scientific responses with citable primary sources. Every response grounded in a row ID you can reference.

Investor relations & equity research

Cover sponsor pipelines with PoA scores you can defend to the committee — not proprietary analyst opinions.

Small biotech leadership

For teams that can’t justify a six-figure enterprise seat but still need the answers. One sign-up, full corpus, one plan.

Regulatory consultants

Track FDA action patterns by committee, reviewer, or drug class. Query the whole CRL transparency dataset in one line.

Academic researchers

A single source, cleared for commercial use, of the same data you’d otherwise pull piecemeal from a dozen APIs and a scraper.

Due-diligence consultants

Want to know which drugs in a sponsor’s portfolio carry ETASU burden? One query. When was a REMS modified, and what changed? It’s in the modification history.