Virtualized PDF Rendering System
Overview
The PDF annotation system virtualizes page rendering to handle large documents efficiently. Instead of rendering every page at once, it renders only the visible pages plus a small overscan buffer, which reduces both rendering work and memory usage.
Architecture
Core Concept
The virtualization system works by:
1. Calculating the heights of all pages at the current zoom level
2. Determining which pages are visible in the viewport
3. Rendering only those pages (plus overscan)
4. Ensuring that the pages of selected items are always rendered
Component Structure
PDF.tsx (Virtualization Engine)
├── Manages visible page range
├── Handles scroll events
├── Calculates page positions
└── Renders PDFPage components conditionally
    └── PDFPage.tsx (Individual Page)
        ├── Renders PDF canvas when visible
        ├── Displays annotations for the page
        └── Manages its own lifecycle
Implementation Details
Page Height Calculation
When the PDF loads or zoom changes:
// In PDF.tsx
useEffect(() => {
  if (!pdfDoc) return;
  let cancelled = false; // set on cleanup so a stale run cannot overwrite newer heights
  (async () => {
    const h: number[] = [];
    for (let i = 1; i <= pdfDoc.numPages; i++) {
      const page = await pdfDoc.getPage(i);
      h.push(page.getViewport({ scale: zoomLevel }).height + 32);
    }
    if (!cancelled) setPageHeights(h); // Cache heights at this zoom level
  })();
  return () => {
    cancelled = true;
  };
}, [pdfDoc, zoomLevel]);
Cumulative Heights
A running total of the page heights gives each page's top offset without re-summing on every lookup:
const cumulative = useMemo(() => {
  const out: number[] = [0];
  for (let i = 0; i < pageHeights.length; i++) {
    out.push(out[i] + pageHeights[i]);
  }
  return out; // cumulative[i] = top position of page i; the last entry is the total height
}, [pageHeights]);
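As a small worked example of the running total, the sketch below computes the same offsets outside React. The function name `cumulativeOffsets` and the sample heights are illustrative and not part of the codebase.

```typescript
// Hypothetical standalone helper mirroring the memoized computation.
function cumulativeOffsets(pageHeights: number[]): number[] {
  const out: number[] = [0];
  let total = 0;
  for (const h of pageHeights) {
    total += h;
    out.push(total); // entry i + 1 is the bottom of page i and the top of page i + 1
  }
  return out;
}

const offsets = cumulativeOffsets([100, 200, 150]);
// offsets is [0, 100, 300, 450]: page 2 starts at 300 and the document is 450 tall
```

The array always has one more entry than there are pages, which is why its last element can double as the total scroll height.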
Visible Range Detection
The first and last visible pages are found by binary search over the cumulative offsets:
const calcRange = useCallback(() => {
  const el = getScrollElement();
  const scroll = /* current scroll position */;
  const viewH = /* viewport height */;

  // Binary search for first visible page
  let lo = 0, hi = cumulative.length - 1;
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (cumulative[mid + 1] < scroll) lo = mid + 1;
    else hi = mid;
  }
  const first = lo;

  // Find last visible page
  const limit = scroll + viewH;
  // ... binary search for last visible

  // Add overscan for smooth scrolling
  const overscan = 2;
  let start = Math.max(0, first - overscan);
  let end = Math.min(pageCount - 1, last + overscan);
  setRange([start, end]);
}, [/* dependencies */]);
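The snippet above elides the search for the last visible page. One possible way to complete it is sketched below as a pure function. The name `visibleRange`, its signature, and the second search loop are assumptions made for illustration and may differ from what PDF.tsx actually does.

```typescript
// Hypothetical pure helper: returns the inclusive [start, end] page range to render.
// `cumulative` has pageCount + 1 entries; cumulative[i] is the top of page i.
function visibleRange(
  cumulative: number[],
  scroll: number,
  viewH: number,
  overscan = 2
): [number, number] {
  const pageCount = cumulative.length - 1;
  if (pageCount <= 0) return [0, 0]; // nothing to render yet

  // First page whose bottom edge is at or below the top of the viewport.
  let lo = 0;
  let hi = pageCount - 1;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (cumulative[mid + 1] < scroll) lo = mid + 1;
    else hi = mid;
  }
  const first = lo;

  // Last page whose top edge is at or above the bottom of the viewport.
  const limit = scroll + viewH;
  lo = first;
  hi = pageCount - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1; // round up so the loop always makes progress
    if (cumulative[mid] > limit) hi = mid - 1;
    else lo = mid;
  }
  const last = lo;

  return [Math.max(0, first - overscan), Math.min(pageCount - 1, last + overscan)];
}

// Ten pages of height 100, viewport covering offsets 250 to 550.
const tenPages = Array.from({ length: 11 }, (_, i) => i * 100);
const tight = visibleRange(tenPages, 250, 300, 0); // [2, 5]
const padded = visibleRange(tenPages, 250, 300); // [0, 7] with the default overscan of 2
```

Both searches run in O(log n) per scroll update, compared with O(n) for a linear scan over the page list.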
Smart Range Expansion
After the visible range is computed, it is widened so that the pages holding the selected annotation, search result, or chat source are always rendered:
// Force selected annotation's page to be visible
if (selectedPageIdx !== undefined) {
  start = Math.min(start, selectedPageIdx);
  end = Math.max(end, selectedPageIdx);
}

// Same for search results
if (selectedSearchPageIdx !== undefined) {
  start = Math.min(start, selectedSearchPageIdx);
  end = Math.max(end, selectedSearchPageIdx);
}

// And chat source highlights
if (selectedChatSourcePageIdx !== undefined) {
  start = Math.min(start, selectedChatSourcePageIdx);
  end = Math.max(end, selectedChatSourcePageIdx);
}
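The three near-identical blocks could be folded into one helper. The sketch below is only an illustration; `expandRange` and `PageRange` are hypothetical names that do not appear in the source.

```typescript
type PageRange = [number, number];

// Hypothetical helper: widen [start, end] so every defined index falls inside it.
function expandRange(
  range: PageRange,
  ...mustInclude: Array<number | undefined>
): PageRange {
  let [start, end] = range;
  for (const idx of mustInclude) {
    if (idx === undefined) continue; // nothing selected in this category
    start = Math.min(start, idx);
    end = Math.max(end, idx);
  }
  return [start, end];
}

const expanded = expandRange([3, 7], 12, undefined, 1);
// expanded is [1, 12]
```

Because the range stays contiguous, a selection far from the viewport causes every page between it and the viewport to be rendered as well, which is worth keeping in mind for very long documents.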
Rendering Loop
Only pages in range are rendered:
return (
  <div style={{ position: "relative" }}>
    {pageInfos.map((pInfo, idx) => {
      const top = cumulative[idx];
      const height = pageHeights[idx];
      const visible = idx >= range[0] && idx <= range[1];
      return (
        <div
          key={pInfo.page.pageNumber}
          style={{
            position: "absolute",
            top,
            height,
            width: "100%",
          }}
        >
          {visible && (
            <PDFPage
              pageInfo={pInfo}
              /* other props */
            />
          )}
        </div>
      );
    })}
    {/* Spacer maintains correct scroll height */}
    <div style={{ height: cumulative[cumulative.length - 1] }} />
  </div>
);
Scroll-to-Annotation System
Scrolling to a specific item happens in two phases:
Phase 1: Page-Level Scroll (PDF.tsx)
When an annotation is selected:
1. Calculate which page contains the annotation
2. Scroll the container so the page is visible
3. Set a pending scroll ID for phase 2
useEffect(() => {
  if (selectedAnnotations.length === 0 || pageHeights.length === 0) return;
  if (selectedPageIdx === undefined) return;
  const targetId = selectedAnnotations[0];

  // Scroll to page
  const topOffset = Math.max(0, cumulative[selectedPageIdx] - 32);
  getScrollElement().scrollTo({ top: topOffset, behavior: "smooth" });

  // Tell PDFPage to center the annotation
  setPendingScrollId(targetId);
}, [selectedAnnotations, selectedPageIdx, /* ... */]);
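To make the offset arithmetic concrete, here is a minimal sketch of the page-level target calculation in isolation. `pageScrollTop` is a hypothetical name, and the 32-pixel default simply mirrors the constant used above.

```typescript
// Hypothetical helper: top offset to scroll to so that page `pageIdx` sits
// `margin` pixels below the top of the viewport, clamped at zero.
function pageScrollTop(cumulative: number[], pageIdx: number, margin = 32): number {
  return Math.max(0, cumulative[pageIdx] - margin);
}

const pageTops = [0, 100, 300, 450];
const target = pageScrollTop(pageTops, 2); // 300 - 32 = 268
const clamped = pageScrollTop(pageTops, 0); // max(0, 0 - 32) = 0
```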
Phase 2: Element-Level Scroll (PDFPage.tsx)
Once the page is rendered:
1. PDFPage checks for pending scroll requests
2. Finds the specific annotation element
3. Scrolls it into view with centering
useEffect(() => {
  if (!hasPdfPageRendered) return;
  if (pendingScrollId) {
    const pageOwnsAnnotation = /* check if annotation is on this page */;
    if (!pageOwnsAnnotation) return;
    let cancelled = false;
    const tryScroll = () => {
      if (cancelled) return;
      const el = document.querySelector(`.selection_${pendingScrollId}`);
      if (el) {
        el.scrollIntoView({ behavior: "smooth", block: "center" });
        setPendingScrollId(null); // Clear pending
      } else {
        requestAnimationFrame(tryScroll); // Retry on the next frame
      }
    };
    tryScroll();
    // Stop retrying if the effect re-runs or the page unmounts
    return () => {
      cancelled = true;
    };
  }
}, [hasPdfPageRendered, pendingScrollId, /* ... */]);
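The retry loop above keeps polling until the element appears or the effect is torn down. A bounded variant is sketched below as one possible refinement; `retryUntilFound`, its parameters, and the attempt cap are all assumptions, not part of PDFPage.tsx. The scheduler is injected so the sketch runs outside a browser; in the app it would be `requestAnimationFrame`.

```typescript
// Hypothetical helper: call `find` once per scheduled tick until it returns a
// value or `maxAttempts` is reached. Returns a cancel function.
function retryUntilFound<T>(
  find: () => T | null,
  onFound: (value: T) => void,
  schedule: (cb: () => void) => void,
  maxAttempts = 60
): () => void {
  let cancelled = false;
  let attempts = 0;
  const tick = () => {
    if (cancelled) return;
    const value = find();
    if (value !== null) {
      onFound(value);
      return;
    }
    attempts += 1;
    if (attempts < maxAttempts) schedule(tick); // give up silently after the cap
  };
  tick();
  return () => {
    cancelled = true;
  };
}
```

In an effect, the returned cancel function can be used directly as the cleanup, so a stale retry chain stops when the pending ID changes or the page unmounts.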
Performance Benefits
Memory Usage
- Only visible pages hold rendered canvases
- Annotations for non-visible pages aren't mounted
- Savings grow with page count and are most noticeable in documents with 100+ pages
Rendering Performance
- Initial load only renders visible pages
- Scrolling only renders newly visible pages
- Zoom changes re-render only the pages in the current range (heights are recomputed for all pages)
Smooth Scrolling
- Overscan ensures pages are rendered before they scroll into view
- Height caching prevents layout recalculations
- requestAnimationFrame aligns scroll work with the browser's paint cycle
Configuration
Overscan Amount
const overscan = 2; // Pages to render above/below viewport
Scroll Container
The system supports both window scrolling and container scrolling:
const getScrollElement = useCallback((): HTMLElement | Window => {
  const el = scrollContainerRef?.current;
  if (el && el.scrollHeight > el.clientHeight) return el;
  return window; // Fallback to window scrolling
}, [scrollContainerRef]);
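Because the scroll target can be either an element or the window, code that reads the scroll position has to handle both shapes. The sketch below shows one way to normalize them. `readScrollMetrics` and the two `*Like` interfaces are hypothetical; the interfaces stand in for the DOM's `Window` and `HTMLElement` so the example runs outside a browser.

```typescript
// Minimal structural stand-ins for the DOM types (assumption for portability).
interface WindowLike {
  scrollY: number;
  innerHeight: number;
}
interface ElementLike {
  scrollTop: number;
  clientHeight: number;
}

// Hypothetical helper: normalize scroll position and viewport height.
function readScrollMetrics(
  target: WindowLike | ElementLike
): { scroll: number; viewH: number } {
  if ("scrollTop" in target) {
    // Scrollable container element
    return { scroll: target.scrollTop, viewH: target.clientHeight };
  }
  // Window fallback
  return { scroll: target.scrollY, viewH: target.innerHeight };
}

const fromElement = readScrollMetrics({ scrollTop: 120, clientHeight: 600 });
const fromWindow = readScrollMetrics({ scrollY: 40, innerHeight: 900 });
```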
Best Practices
- Keep overscan reasonable - Too much defeats virtualization benefits
- Cache computations - Page heights are expensive to calculate
- Use binary search - A linear scan on every scroll update scales poorly for large documents
- Handle edge cases - Pages holding selected items must always be rendered
- Throttle scroll handling - Use requestAnimationFrame to limit updates to one per frame
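The last practice can be sketched as a small wrapper that lets at most one update run per animation frame. `frameThrottle` is a hypothetical name, and the scheduler is passed in so the example is runnable without a browser; in the app it would be `requestAnimationFrame`.

```typescript
// Hypothetical helper: coalesce bursts of calls into one run per scheduled frame.
function frameThrottle(
  fn: () => void,
  schedule: (cb: () => void) => void
): () => void {
  let queued = false;
  return () => {
    if (queued) return; // a run is already pending for this frame
    queued = true;
    schedule(() => {
      queued = false;
      fn();
    });
  };
}

// Usage with a manual frame queue standing in for requestAnimationFrame.
const frames: Array<() => void> = [];
let runs = 0;
const onScroll = frameThrottle(() => { runs += 1; }, (cb) => { frames.push(cb); });
onScroll();
onScroll();
onScroll(); // three scroll events, one frame callback queued
```

Strictly speaking this is throttling rather than debouncing: the handler still runs during a continuous scroll, just no more than once per frame.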
Future Enhancements
- Dynamic overscan - Adjust based on scroll velocity
- Progressive rendering - Low-res preview while scrolling
- Intersection Observer - More efficient visibility detection
- Memory pressure handling - Reduce overscan under memory constraints
- Predictive preloading - Anticipate scroll direction