7.6 KiB
Project overview
Your goal is to build a next.js app that allows users to upload PDF files and use OpenAI's structured output feature to extract information from the PDF file and convert it to an excel file. You will be using NextJS 14, shadcn, tailwind, and Lucid icon
Core functionality
1. File Upload & Schema Definition
- Users should be able to upload one or more PDF files
- Users should be able to define data points they want to extract:
- Individual fields (single value extractions)
- Groups (array of objects with consistent structure)
- For each group, users can define multiple fields inside, or even other groups
- There should be a button 'Start extraction'
- Set default schema to showcase how this works
- company: name of company
- address: address of company
- total sum: total amount we purchased
- items (group):
- item: name of item
- unit price: unit price of item
- quantity: quantity we purchased
- sum: total amount we purchased
- Server-side file processing
2. Text Extraction
- Use LlamaParser for PDF text extraction (server-side)
- For each file, combine all document chunks for complete text. Make sure to return full text of all documents, not just the first one documents[0]
- The llamaparser text extraction should happen immediately after user uploads files to UI, and not wait for a button click
- Strictly following ## 1. LlamaParser Documentation as code implementation example
- After each file is uploaded, it should be displayed as an item on the page, displaying the file name with a button to click to preview the full text extracted
- User can keep adding new files to the list, previously uploaded files should be displayed
- Server-side processing only
3. Data Processing
- After clicking on 'Start Extraction', the data should be sent to OpenAI for processing across all files
- Use OpenAI structured output for information extraction
- Strictly following ## 2. OpenAI Documentation as code implementation example
4. File Download
- Combine data processed from multiple PDFs into one excel file
- When there are nested structures like {'company': xxx, 'items': [{'item': xxx, 'unit price': xxx, 'quantity': xxx, 'sum': xxx}]}, it should be flattened when generating the excel file
- Implement proper error handling and type safety
- Enable excel file download
- Implement temporary file cleanup
Doc
1. LlamaParser Documentation
First, get an api key. We recommend putting your key in a file called .env that looks like this:
LLAMA_CLOUD_API_KEY=llx-xxxxxx
Set up a new TypeScript project in a new folder, we use this:
npm init npm install -D typescript @types/node
LlamaParse support is built-in to LlamaIndex for TypeScript, so you'll need to install LlamaIndex.TS:
npm install llamaindex dotenv
Let's create a parse.ts file and put our dependencies in it:
import { LlamaParseReader, // we'll add more here later } from "llamaindex"; import 'dotenv/config'
Now let's create our main function, which will load in fun facts about Canada and parse them:
async function main() { // save the file linked above as sf_budget.pdf, or change this to match const path = "./canada.pdf";
// set up the llamaparse reader const reader = new LlamaParseReader({ resultType: "markdown" });
// parse the document const documents = await reader.loadData(path);
// print the parsed document console.log(documents) }
main().catch(console.error);
Now run the file:
npx tsx parse.ts
Congratulations! You've parsed the file, and should see output that looks like this:
[ Document { id_: '02f5e252-9dca-47fa-80b2-abdd902b911a', embedding: undefined, metadata: { file_path: './canada.pdf' }, excludedEmbedMetadataKeys: [], excludedLlmMetadataKeys: [], relationships: {}, text: '# Fun Facts About Canada\n' + '\n' + 'We may be known as the Great White North, but ...etc...
2. OpenAI Documentation
Make sure to use the gpt-4o model and zod for defining data structures.
import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";
const openai = new OpenAI();
const ResearchPaperExtraction = z.object({
title: z.string(),
authors: z.array(z.string()),
abstract: z.string(),
keywords: z.array(z.string()),
});
const completion = await openai.beta.chat.completions.parse({
model: "gpt-4o-2024-08-06",
messages: [
{ role: "system", content: "You are an expert at structured data extraction. You will be given unstructured text from a research paper and should convert it into the given structure." },
{ role: "user", content: "..." },
],
response_format: zodResponseFormat(ResearchPaperExtraction, "research_paper_extraction"),
});
const research_paper = completion.choices[0].message.parsed;
Important Implementation Notes
0. Adding logs
- Always add server side logs to your code so we can debug any potential issues
1. Project setup
- All new components should go in /components at the root (not in the app folder) and be named like example-component.tsx unless otherwise specified
- All new pages go in /app
- Use the Next.js 14 app router
- All data fetching should be done in a server component and pass the data down as props
- Client components (useState, hooks, etc) require that 'use client' is set at the top of the file
2. Server-Side API Calls:
- All interactions with external APIs (e.g., Reddit, OpenAI) should be performed server-side.
- Create dedicated API routes in the
pages/api
directory for each external API interaction. - Client-side components should fetch data through these API routes, not directly from external APIs.
3. Environment Variables
- Store all sensitive information (API keys, credentials) in environment variables.
- Use a
.env.local
file for local development and ensure it's listed in.gitignore
. - For production, set environment variables in the deployment platform (e.g., Vercel).
- Access environment variables only in server-side code or API routes.
4. Error Handling and Logging
- Implement comprehensive error handling in both client-side components and server-side API routes.
- Log errors on the server-side for debugging purposes.
- Display user-friendly error messages on the client-side.
5. Type Safety
- Use TypeScript interfaces for all data structures, especially API responses.
- Avoid using
any
type; instead, define proper types for all variables and function parameters.
6. API Client Initialization
- Initialize API clients (e.g., Snoowrap for Reddit, OpenAI) in server-side code only.
- Implement checks to ensure API clients are properly initialized before use.
7. Data Fetching in Components
- Use React hooks (e.g.,
useEffect
) for data fetching in client-side components. - Implement loading states and error handling for all data fetching operations.
8. Next.js Configuration
- Utilize
next.config.mjs
for environment-specific configurations. - Use the
env
property innext.config.mjs
to make environment variables available to the application.
9. CORS and API Routes
- Use Next.js API routes to avoid CORS issues when interacting with external APIs.
- Implement proper request validation in API routes.
10. Component Structure
- Separate concerns between client and server components.
- Use server components for initial data fetching and pass data as props to client components.
11. Security
- Never expose API keys or sensitive credentials on the client-side.
- Implement proper authentication and authorization for API routes if needed.
12. Special Syntax
- When using shadcn, use npx shadcn@latest add xxx, instead of shadcn-ui@latest, this is deprecated