# AI Interview Platform — Complete Architecture Design

## 1. System Architecture Overview

### 1.1 Overall Architecture Pattern

**Recommended: microservices + event-driven architecture**

**Rationale:**

- **Scalability**: modules (resume processing, AI interviews, analysis reports) can scale independently
- **Technology diversity**: AI modules can use Python while business logic uses Java/Node.js
- **Fault tolerance**: a single service failure does not take down the whole system
- **Team collaboration**: teams can develop different services in parallel
- **Deployment flexibility**: supports canary releases and fast rollback

**Layered view:**

```
┌─────────────────────────────────────────────────────────────┐
│                  API Gateway (Kong/Nginx)                   │
├─────────────────────────────────────────────────────────────┤
│ User Svc │ Resume Svc │ Interview Svc │ Analysis │ Company  │
├─────────────────────────────────────────────────────────────┤
│                Message Queue (Apache Kafka)                 │
├─────────────────────────────────────────────────────────────┤
│   PostgreSQL   │   MongoDB   │   Redis   │  Elasticsearch   │
├─────────────────────────────────────────────────────────────┤
│            Infrastructure (Kubernetes + Docker)             │
└─────────────────────────────────────────────────────────────┘
```

### 1.2 Key Components and Interactions

**Core microservices:**

1. **Auth Service**
   - User registration, login, and permission management
   - JWT token issuance and validation
   - OAuth 2.0 third-party login integration
2. **Resume Service**
   - Resume file upload and storage
   - AI-based parsing into structured data
   - Resume search and matching
3. **Interview Service**
   - AI mock-interview flow management
   - Real-time audio/video processing
   - Interview question generation and management
4. **Analysis Service**
   - Interview audio/video analysis
   - Evaluation report generation
   - Statistics and insights
5. **Company Service**
   - Job posting and management
   - Candidate management
   - Hiring pipeline management
6. **Notification Service**
   - Email and SMS notifications
   - Real-time push messages
   - Message template management

**Service interaction flows:**

```
Login            → Auth Service      → issue token
Resume upload    → Resume Service    → message queue   → AI parsing service
Start interview  → Interview Service → AI service      → real-time interaction
Interview ends   → Analysis Service  → generate report → Notification Service
```

### 1.3 System Boundaries and External Interfaces

**External service integrations:**

- **Cloud storage**: Alibaba Cloud OSS / AWS S3 (file storage)
- **AI services**: iFLYTEK Spark / OpenAI API (natural language processing)
- **Audio/video**: Alibaba Cloud RTC / Tencent Cloud TRTC (real-time communication)
- **Email**: Alibaba Cloud DirectMail / SendGrid
- **SMS**: Alibaba Cloud SMS / Twilio
- **Third-party login**: WeChat, DingTalk, LinkedIn OAuth
- **Payments**: Alipay, WeChat Pay (paid enterprise features)
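The interaction flows above are event-driven: each step publishes an event that downstream services consume independently. A minimal in-process sketch of that routing, with the real Kafka broker replaced by a dictionary of handlers (the handler functions are hypothetical stand-ins for the parsing, analysis, and notification services):

```python
from collections import defaultdict

# Minimal pub/sub dispatcher standing in for the message queue.
_handlers = defaultdict(list)

def subscribe(topic, handler):
    _handlers[topic].append(handler)

def publish(topic, event):
    for handler in _handlers[topic]:
        handler(event)

# Hypothetical downstream services subscribing to the flow's topics.
log = []
subscribe("resume-uploaded", lambda e: log.append(("parse", e["resume_id"])))
subscribe("interview-completed", lambda e: log.append(("report", e["interview_id"])))
subscribe("interview-completed", lambda e: log.append(("notify", e["interview_id"])))

publish("resume-uploaded", {"resume_id": "r-1"})
publish("interview-completed", {"interview_id": "i-9"})
# log now holds ("parse", "r-1"), ("report", "i-9"), ("notify", "i-9")
```

Note that one event ("interview-completed") fans out to two consumers — this is the decoupling that lets the analysis and notification services evolve and scale separately.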
## 2. Frontend Architecture

### 2.1 Recommended Framework and Libraries

**Main framework: Vue 3 + TypeScript**

**Rationale:**

- **Gentle learning curve**: easier to pick up than React
- **Strong performance**: the Composition API enables better logic reuse
- **Mature ecosystem**: the Vue ecosystem is well established with a rich plugin selection
- **TypeScript support**: first-class, type-safe
- **Progressive**: can be adopted incrementally, matching the team's skill level

**Stack:**

```
Vue 3.4+ + TypeScript 5.0+
Vite 5.0+          (build tool)
Vue Router 4.0+    (routing)
Pinia 2.0+         (state management)
Element Plus       (UI component library)
Tailwind CSS       (styling)
```

### 2.2 State Management Strategy

**Recommended: Pinia**

**Advantages:**

- **TypeScript-friendly**: full type inference
- **Modular**: naturally supports modular state
- **DevTools support**: excellent debugging experience
- **Lightweight**: simpler than Vuex

**State structure:**

```typescript
import { defineStore } from 'pinia'

// stores/auth.ts — authentication state
export const useAuthStore = defineStore('auth', {
  state: () => ({
    user: null as User | null,
    token: localStorage.getItem('token'),
    permissions: [] as string[]
  })
})

// stores/interview.ts — interview state
export const useInterviewStore = defineStore('interview', {
  state: () => ({
    currentInterview: null as Interview | null,
    questions: [] as Question[],
    answers: [] as Answer[],
    status: 'idle' as InterviewStatus
  })
})

// stores/resume.ts — resume state
export const useResumeStore = defineStore('resume', {
  state: () => ({
    resumes: [] as Resume[],
    currentResume: null as Resume | null,
    uploadProgress: 0
  })
})
```

### 2.3 UI Component Library

**Primary choice: Element Plus**

**Rationale:**

- **Native Vue 3 support**: designed for Vue 3
- **Rich component set**: covers most business scenarios
- **Customizable**: supports theming
- **Well documented**: easy to use
- **Enterprise-grade**: suited to B2B applications

**Complements:**

- **Tailwind CSS**: atomic CSS for fast styling
- **Headless UI**: unstyled components for maximal customization
- **Custom components**: special-purpose business components (e.g., the AI interview UI)

### 2.4 Module Organization and Performance Optimization

**Project structure:**

```
src/
├── components/    # shared components
│   ├── common/    # base components
│   ├── business/  # business components
│   └── layout/    # layout components
├── views/         # page components
│   ├── auth/      # authentication pages
│   ├── interview/ # interview pages
│   ├── resume/    # resume pages
│   └── company/   # company admin pages
├── stores/        # Pinia stores
├── composables/   # composition functions
├── utils/         # utilities
├── api/           # API clients
├── types/         # TypeScript type definitions
└── assets/        # static assets
```

**Performance optimizations:**
1. **Code splitting and lazy loading**

```typescript
// Route-level lazy loading
const Interview = () => import('@/views/interview/InterviewView.vue')
const Resume = () => import('@/views/resume/ResumeView.vue')

// Component-level lazy loading
const HeavyComponent = defineAsyncComponent(() =>
  import('@/components/HeavyComponent.vue')
)
```

2. **Asset optimization**
   - **Image lazy loading**: use Intersection Observer
   - **CDN**: serve static assets from a CDN
   - **Gzip**: enable server-side compression
   - **Caching**: set sensible cache headers

3. **Rendering optimization**
   - **Virtual scrolling**: for large lists
   - **Debounce/throttle**: for search and input events
   - **Keep-alive**: cache inactive components

## 3. Backend Architecture

### 3.1 Recommended Technology Stack

**Primary stack: Java + Spring Boot**

**Rationale:**

- **Enterprise maturity**: the Spring ecosystem is comprehensive
- **Strong performance**: JVM optimizations, good high-concurrency handling
- **Active community**: rich third-party libraries and solutions
- **Hiring**: Java developers are relatively easy to recruit
- **Microservice support**: Spring Cloud provides a complete microservice toolkit

**Stack:**

```
Java 17 + Spring Boot 3.2+
Spring Cloud 2023.0+   (microservice framework)
Spring Security 6.0+   (security)
Spring Data JPA        (data access)
MyBatis Plus           (ORM enhancements)
Redis                  (cache)
Apache Kafka           (message queue)
Docker + Kubernetes    (containerization)
```

**Alternative: Node.js + NestJS**

- **When to use**: strong team JavaScript skills, need for rapid development
- **Pros**: unified frontend/backend stack, high development velocity
- **Cons**: weaker at large-scale concurrent workloads

### 3.2 API Design Principles

**Recommended: RESTful APIs plus GraphQL**

**REST for:**

- Standard CRUD operations
- File upload/download
- Authentication and authorization

**GraphQL for:**

- Complex data queries
- Flexible data fetching from the frontend
- Reducing the number of network round trips

**API conventions:**

```
# RESTful examples
GET    /api/v1/users        # list users
POST   /api/v1/users        # create user
GET    /api/v1/users/{id}   # get a specific user
PUT    /api/v1/users/{id}   # update user
DELETE /api/v1/users/{id}   # delete user

# Business examples
POST   /api/v1/resumes/upload              # upload a resume
POST   /api/v1/interviews/start            # start an interview
GET    /api/v1/interviews/{id}/analysis    # fetch interview analysis
```

**Standardized response format:**

```json
{
  "code": 200,
  "message": "success",
  "data": {
    // payload
  },
  "timestamp": "2024-01-15T10:30:00Z",
  "traceId": "abc123def456"
}
```

### 3.3 Authentication and Authorization

**Recommended: JWT + OAuth 2.0**

**JWT payload design:**

```json
{
  "sub": "user123",
  "name": "张三",
  "role": "candidate",
  "permissions": ["resume:read", "interview:create"],
  "exp": 1642234567,
  "iat": 1642148167
}
```

**Permission model:**

```
User → Role → Permission

Roles:
- CANDIDATE:     job seeker
- HR:            human resources
- ADMIN:         system administrator
- COMPANY_ADMIN: company administrator

Permission examples:
- resume:read, resume:write
- interview:create, interview:manage
- company:manage, user:manage
```

**Security implementation:**

```java
@RestController
@RequestMapping("/api/v1/interviews")
@PreAuthorize("hasRole('CANDIDATE') or hasRole('HR')")
public class InterviewController {

    @PostMapping("/start")
    @PreAuthorize("hasPermission('interview', 'create')")
    public ResponseEntity<?> startInterview(@RequestBody StartInterviewRequest request) {
        // interview start logic
    }
}
```

### 3.4 Business Logic Organization

**Recommended: domain-driven design (DDD) + layered architecture**

**Layers:**

```
┌─────────────────────────────────────┐
│        Presentation Layer           │  # Controller, DTO
├─────────────────────────────────────┤
│        Application Layer            │  # Service, Use Cases
├─────────────────────────────────────┤
│          Domain Layer               │  # Entity, Domain Service
├─────────────────────────────────────┤
│       Infrastructure Layer          │  # Repository, External APIs
└─────────────────────────────────────┘
```

**Domain model examples:**

```java
// Interview domain
@Entity
public class Interview {
    private InterviewId id;
    private CandidateId candidateId;
    private JobId jobId;
    private InterviewStatus status;
    private List<Question> questions;
    private List<Answer> answers;

    // Domain method
    public void start() {
        if (this.status != InterviewStatus.SCHEDULED) {
            throw new IllegalStateException("Interview cannot be started");
        }
        this.status = InterviewStatus.IN_PROGRESS;
        // Publish a domain event
        DomainEvents.publish(new InterviewStartedEvent(this.id));
    }
}

// Resume domain
@Entity
public class Resume {
    private ResumeId id;
    private CandidateId candidateId;
    private PersonalInfo personalInfo;
    private List<WorkExperience> workExperiences;
    private List<Skill> skills;

    public MatchScore calculateMatchScore(JobRequirement requirement) {
        // resume matching algorithm
    }
}
```

### 3.5 Asynchronous Task Processing

**Recommended: Apache Kafka + Spring Cloud Stream**

**Topic design:**

```
Topics:
- resume-uploaded:        resume uploaded
- resume-parsed:          resume parsing finished
- interview-started:      interview started
- interview-completed:    interview finished
- analysis-requested:     analysis requested
- notification-requested: notification requested
```

**Asynchronous processing example:**

```java
// Asynchronous parsing after a resume upload
@Component
public class ResumeEventHandler {

    @KafkaListener(topics = "resume-uploaded")
    public void handleResumeUploaded(ResumeUploadedEvent event) {
        // Asynchronously invoke the AI parsing service
        aiParsingService.parseResumeAsync(event.getResumeId());
    }

    @KafkaListener(topics = "interview-completed")
    public void handleInterviewCompleted(InterviewCompletedEvent event) {
        // Generate the analysis report asynchronously
        analysisService.generateReportAsync(event.getInterviewId());
        // Send notifications
        notificationService.sendInterviewCompletionNotification(event);
    }
}
```

**Task priority design:**

```
High priority:   authentication, real-time interview interaction
Medium priority: resume parsing, report generation
Low priority:    statistics, log processing
```

## 4. Data Architecture

### 4.1 Database Selection

**Hybrid database architecture**

**Relational database: PostgreSQL 15+**

- **Used for**: users, companies, jobs, interview records
- **Strengths**: ACID guarantees, complex queries, data consistency
- **Why PostgreSQL**: stronger JSON support, full-text search, and extensibility than MySQL

**Document database: MongoDB 7.0+**

- **Used for**: structured resume data, interview analysis results, unstructured logs
- **Strengths**: flexible schema, horizontal scaling, native JSON
- **Why MongoDB**: resume data is heterogeneous, which suits document storage and querying

**Search engine: Elasticsearch 8.0+**

- **Used for**: resume full-text search, job matching, log analytics
- **Strengths**: powerful full-text search, real-time analytics, distributed architecture

**Cache: Redis 7.0+**

- **Used for**: session storage, hot-data caching, distributed locks
- **Strengths**: high performance, rich data structures, persistence support

**CAP trade-offs:**

- **Core user data**: CP (consistency + partition tolerance), via PostgreSQL
- **Search and analytics**: AP (availability + partition tolerance), via Elasticsearch
- **Cache data**: AP, via a Redis cluster

### 4.2 Detailed Data Model

**PostgreSQL core entities:**

```sql
-- Users
CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    uuid UUID UNIQUE NOT NULL DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    user_type VARCHAR(20) NOT NULL CHECK (user_type IN ('candidate', 'hr', 'admin')),
    profile JSONB,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    deleted_at TIMESTAMP WITH TIME ZONE
);

-- Companies
CREATE TABLE companies (
    id BIGSERIAL PRIMARY KEY,
    uuid UUID UNIQUE NOT NULL DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    industry VARCHAR(100),
    size_range VARCHAR(50),
    description TEXT,
    logo_url VARCHAR(500),
    website VARCHAR(255),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Jobs
CREATE TABLE jobs (
    id BIGSERIAL PRIMARY KEY,
    uuid UUID UNIQUE NOT NULL DEFAULT gen_random_uuid(),
    company_id BIGINT NOT NULL REFERENCES
companies(id),
    title VARCHAR(255) NOT NULL,
    description TEXT,
    requirements JSONB,        -- skill and experience requirements
    salary_range JSONB,        -- {"min": 10000, "max": 20000, "currency": "CNY"}
    location VARCHAR(255),
    job_type VARCHAR(50),      -- full-time, part-time, contract
    status VARCHAR(20) DEFAULT 'active',
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Resumes (basic info)
CREATE TABLE resumes (
    id BIGSERIAL PRIMARY KEY,
    uuid UUID UNIQUE NOT NULL DEFAULT gen_random_uuid(),
    candidate_id BIGINT NOT NULL REFERENCES users(id),
    original_filename VARCHAR(255),
    file_url VARCHAR(500),
    file_type VARCHAR(20),
    file_size BIGINT,
    parsing_status VARCHAR(20) DEFAULT 'pending',  -- pending, processing, completed, failed
    parsed_data_id VARCHAR(100),                   -- MongoDB document ID
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Interviews
CREATE TABLE interviews (
    id BIGSERIAL PRIMARY KEY,
    uuid UUID UNIQUE NOT NULL DEFAULT gen_random_uuid(),
    job_id BIGINT NOT NULL REFERENCES jobs(id),
    candidate_id BIGINT NOT NULL REFERENCES users(id),
    resume_id BIGINT REFERENCES resumes(id),
    status VARCHAR(20) DEFAULT 'scheduled',  -- scheduled, in_progress, completed, cancelled
    scheduled_at TIMESTAMP WITH TIME ZONE,
    started_at TIMESTAMP WITH TIME ZONE,
    completed_at TIMESTAMP WITH TIME ZONE,
    duration_seconds INTEGER,
    video_url VARCHAR(500),
    audio_url VARCHAR(500),
    transcript_data_id VARCHAR(100),  -- MongoDB document ID
    analysis_data_id VARCHAR(100),    -- MongoDB document ID
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Interview questions
CREATE TABLE interview_questions (
    id BIGSERIAL PRIMARY KEY,
    interview_id BIGINT NOT NULL REFERENCES interviews(id),
    question_text TEXT NOT NULL,
    question_type VARCHAR(50),  -- behavioral, technical, situational
    asked_at TIMESTAMP WITH TIME ZONE,
    answer_text TEXT,
    answer_duration_seconds INTEGER,
    ai_score
DECIMAL(3,2),  -- 0.00-1.00
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
```

**MongoDB document structures:**

```javascript
// Parsed resume collection
db.parsed_resumes.insertOne({
  _id: ObjectId(),
  resume_id: "uuid-from-postgresql",
  personal_info: {
    name: "张三",
    email: "zhangsan@example.com",
    phone: "+86-13800138000",
    location: "北京市朝阳区",
    birth_date: "1990-01-01",
    gender: "male"
  },
  work_experiences: [
    {
      company: "阿里巴巴",
      position: "Senior Java Developer",
      start_date: "2020-03-01",
      end_date: "2023-12-31",
      description: "Backend development for an e-commerce platform...",
      skills_used: ["Java", "Spring Boot", "MySQL", "Redis"]
    }
  ],
  education: [
    {
      school: "清华大学",
      degree: "Bachelor",
      major: "Computer Science and Technology",
      start_date: "2016-09-01",
      end_date: "2020-06-30",
      gpa: 3.8
    }
  ],
  skills: [
    {
      category: "Programming languages",
      items: [
        { "name": "Java", "level": "expert", "years": 5 },
        { "name": "Python", "level": "intermediate", "years": 2 }
      ]
    }
  ],
  projects: [
    {
      name: "E-commerce recommendation system",
      description: "Product recommendations powered by machine learning",
      technologies: ["Python", "TensorFlow", "Redis"],
      start_date: "2022-01-01",
      end_date: "2022-06-30"
    }
  ],
  parsing_metadata: {
    parsed_at: new Date(),
    parser_version: "v2.1.0",
    confidence_score: 0.95,
    extracted_keywords: ["Java", "Spring Boot", "microservices", "high concurrency"]
  }
});

// Interview analysis collection
db.interview_analysis.insertOne({
  _id: ObjectId(),
  interview_id: "uuid-from-postgresql",
  overall_score: 0.78,
  analysis_dimensions: {
    technical_competency: {
      score: 0.82,
      details: {
        keyword_coverage: 0.85,
        technical_depth: 0.80,
        problem_solving: 0.78
      }
    },
    communication_skills: {
      score: 0.75,
      details: {
        fluency: 0.80,
        clarity: 0.72,
        confidence: 0.73
      }
    },
    behavioral_assessment: {
      score: 0.77,
      details: {
        leadership: 0.75,
        teamwork: 0.80,
        adaptability: 0.76
      }
    }
  },
  question_analysis: [
    {
      question_id: "q1",
      question_text: "Please walk me through your project experience",
      answer_analysis: {
        duration_seconds: 120,
        word_count: 180,
        technical_keywords: ["microservices", "Spring Cloud", "Docker"],
        sentiment_score: 0.8,
        confidence_level: 0.75
      }
    }
  ],
  recommendations: [
    "Strong technical skills; recommend advancing to the next round",
    "Communication could improve; consider offering relevant training"
  ],
  analyzed_at: new Date(),
  analyzer_version: "v1.5.0"
});
```

### 4.3 Normalization and Denormalization

**Relational normalization (third normal form):**

- **Users table**: no redundant personal information
- **Company–job relation**: foreign keys avoid duplicating company data
- **Interview–question relation**: one-to-many, questions stored independently

**Denormalization for performance:**

1. **Redundant frequently-read fields**

```sql
-- Denormalize candidate name and job title onto interviews
ALTER TABLE interviews ADD COLUMN candidate_name VARCHAR(100);
ALTER TABLE interviews ADD COLUMN job_title VARCHAR(255);

-- Keep the copies in sync with a trigger
CREATE OR REPLACE FUNCTION sync_interview_denormalized_data()
RETURNS TRIGGER AS $$
BEGIN
    UPDATE interviews
    SET candidate_name = (SELECT profile->>'name' FROM users WHERE id = NEW.candidate_id),
        job_title = (SELECT title FROM jobs WHERE id = NEW.job_id)
    WHERE id = NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
```

2. **Precomputed aggregates**

```sql
-- Company statistics
CREATE TABLE company_statistics (
    company_id BIGINT PRIMARY KEY REFERENCES companies(id),
    total_jobs INTEGER DEFAULT 0,
    active_jobs INTEGER DEFAULT 0,
    total_interviews INTEGER DEFAULT 0,
    avg_interview_score DECIMAL(3,2),
    last_updated TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
```

### 4.4 Query Optimization and Caching

**Indexing strategy:**

```sql
-- Composite indexes for common queries
CREATE INDEX idx_interviews_candidate_status ON interviews(candidate_id, status);
CREATE INDEX idx_jobs_company_status ON jobs(company_id, status);
CREATE INDEX idx_resumes_candidate_parsing ON resumes(candidate_id, parsing_status);

-- Partial indexes for specific conditions
CREATE INDEX idx_active_jobs ON jobs(company_id) WHERE status = 'active';
CREATE INDEX idx_completed_interviews ON interviews(job_id, completed_at)
    WHERE status = 'completed';

-- JSONB indexes
CREATE INDEX idx_user_profile_gin ON users USING GIN(profile);
CREATE INDEX idx_job_requirements_gin ON jobs USING GIN(requirements);

-- Full-text search index
-- (requires a Chinese text-search configuration, e.g. via the zhparser extension)
CREATE INDEX idx_jobs_fulltext ON jobs USING GIN(
    to_tsvector('chinese', title || ' ' || description)
);
```

**Query optimization example:**

```sql
-- Before: resume matching query
SELECT r.*, u.profile->>'name' as candidate_name
FROM resumes r
JOIN users u ON r.candidate_id = u.id
WHERE r.parsing_status = 'completed'
AND EXISTS (
    SELECT 1 FROM parsed_resumes pr
    WHERE pr.resume_id = r.uuid::text
    AND pr.skills @> '[{"name":
"Java"}]' ); -- 优化后:使用物化视图 CREATE MATERIALIZED VIEW resume_search_view AS SELECT r.id, r.uuid, r.candidate_id, u.profile->>'name' as candidate_name, pr.skills, pr.work_experiences, pr.parsing_metadata->>'extracted_keywords' as keywords FROM resumes r JOIN users u ON r.candidate_id = u.id JOIN parsed_resumes pr ON pr.resume_id = r.uuid::text WHERE r.parsing_status = 'completed'; CREATE INDEX idx_resume_search_skills ON resume_search_view USING GIN(skills); ``` **多级缓存策略:** ``` L1缓存(应用内存): - 用户会话信息(30分钟) - 常用配置数据(1小时) - 热点职位信息(15分钟) L2缓存(Redis): - 用户详细信息(2小时) - 简历解析结果(24小时) - 搜索结果(30分钟) - 面试分析报告(永久,手动失效) L3缓存(CDN): - 静态资源(图片、CSS、JS) - 公开的企业信息页面 ``` **Redis缓存设计:** ```java @Service public class CacheService { // 用户信息缓存 @Cacheable(value = "user", key = "#userId", unless = "#result == null") public User getUserById(Long userId) { return userRepository.findById(userId).orElse(null); } // 简历搜索结果缓存 @Cacheable(value = "resume_search", key = "#searchKey") public List searchResumes(String searchKey, SearchCriteria criteria) { return resumeSearchService.search(searchKey, criteria); } // 分布式锁防止缓存击穿 @RedisLock(key = "interview_analysis:#{#interviewId}", waitTime = 5, leaseTime = 30) public InterviewAnalysis getOrGenerateAnalysis(String interviewId) { // 先查缓存,没有则生成 } } ``` ### 4.5 可扩展性设计 **数据库扩展策略:** 1. **读写分离** ``` Master(写):处理所有写操作 Slave1(读):处理用户查询、简历搜索 Slave2(读):处理报表查询、数据分析 Slave3(读):处理面试相关查询 ``` 2. **垂直分库** ``` user_db:用户、认证相关表 company_db:企业、职位相关表 resume_db:简历相关表 interview_db:面试相关表 analytics_db:分析、报表相关表 ``` 3. **水平分片策略** **用户表分片(按用户ID):** ```sql -- 分片键:user_id % 16 user_shard_0: user_id % 16 = 0 user_shard_1: user_id % 16 = 1 ... 
user_shard_15: user_id % 16 = 15
```

**Interviews (shard by time):**

```sql
-- Monthly partitions
interview_2024_01: created_at >= '2024-01-01' AND created_at < '2024-02-01'
interview_2024_02: created_at >= '2024-02-01' AND created_at < '2024-03-01'
```

**MongoDB sharding:**

```javascript
// Enable sharding
sh.enableSharding("interview_platform")

// Shard resume data by hashed resume ID
sh.shardCollection(
  "interview_platform.parsed_resumes",
  { "resume_id": "hashed" }
)

// Shard interview analysis by hashed interview ID
sh.shardCollection(
  "interview_platform.interview_analysis",
  { "interview_id": "hashed" }
)
```

### 4.6 Backup and Recovery

**Backup strategy:**

1. **PostgreSQL backups**

```bash
#!/bin/bash
# Full backup (daily at 02:00)
DATE=$(date +%Y%m%d)
DATABASE="interview_platform"
BACKUP_DIR="/backup/postgresql"

pg_dump -h localhost -U postgres -d $DATABASE | \
  gzip > $BACKUP_DIR/full_backup_$DATE.sql.gz

# Upload to cloud storage
aws s3 cp $BACKUP_DIR/full_backup_$DATE.sql.gz \
  s3://backup-bucket/postgresql/

# Keep 30 days of backups
find $BACKUP_DIR -name "full_backup_*.sql.gz" -mtime +30 -delete
```

2. **Incremental backups (WAL archiving)**

```bash
# postgresql.conf settings
wal_level = replica
archive_mode = on
archive_command = 'cp %p /backup/wal_archive/%f'
max_wal_senders = 3

# Base backup for incremental recovery
pg_basebackup -h localhost -D /backup/base_backup -U replicator -v -P
```

3. **MongoDB backups**

```bash
# Full backup
mongodump --host mongodb-cluster --authenticationDatabase admin \
  --username backup_user --password backup_pass \
  --out /backup/mongodb/$(date +%Y%m%d)

# Compress and upload
tar -czf /backup/mongodb_$(date +%Y%m%d).tar.gz \
  /backup/mongodb/$(date +%Y%m%d)
aws s3 cp /backup/mongodb_$(date +%Y%m%d).tar.gz \
  s3://backup-bucket/mongodb/
```

**Recovery strategy:**

1. **Point-in-time recovery (PITR)**

```bash
# Restore to a specific point in time
pg_ctl stop -D /var/lib/postgresql/data
rm -rf /var/lib/postgresql/data/*

# Restore the base backup
tar -xzf /backup/base_backup.tar.gz -C /var/lib/postgresql/data

# Configure recovery (PostgreSQL 12+ removed recovery.conf;
# use postgresql.auto.conf plus a recovery.signal file)
cat >> /var/lib/postgresql/data/postgresql.auto.conf <<EOF
restore_command = 'cp /backup/wal_archive/%f %p'
recovery_target_time = '2024-01-15 14:30:00'
EOF
touch /var/lib/postgresql/data/recovery.signal

pg_ctl start -D /var/lib/postgresql/data
```
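The 30-day retention rule applied by the full-backup script above (`find ... -mtime +30 -delete`) can be sketched as a standalone pruning routine. This is a sketch, assuming the script's `full_backup_YYYYMMDD.sql.gz` naming convention:

```python
from datetime import date, timedelta

def backups_to_delete(filenames, today, keep_days=30):
    """Return backup files older than keep_days, based on the
    full_backup_YYYYMMDD.sql.gz naming convention."""
    expired = []
    cutoff = today - timedelta(days=keep_days)
    for name in filenames:
        # Extract the YYYYMMDD stamp from the filename
        stamp = name.removeprefix("full_backup_").split(".")[0]
        backed_up = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
        if backed_up < cutoff:
            expired.append(name)
    return expired

names = ["full_backup_20240101.sql.gz", "full_backup_20240201.sql.gz"]
# With today = 2024-02-05 and a 30-day window, only the January file expires.
print(backups_to_delete(names, date(2024, 2, 5)))
```

In production the deletion itself should run only after the cloud-storage upload has been verified, so a failed upload never leaves a retention gap.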
2. **Disaster recovery plan**

**RTO (recovery time objective): 4 hours**
**RPO (recovery point objective): 1 hour**

**Recovery priorities:**

```
1. Authentication service      (within 15 minutes)
2. Core business databases     (within 1 hour)
3. File storage services       (within 2 hours)
4. Analytics and reporting     (within 4 hours)
```

**Geo-redundancy:**

- **Primary site**: Alibaba Cloud China East 1 (Hangzhou)
- **Standby site**: Alibaba Cloud China North 2 (Beijing)
- **Data sync**: real-time primary/replica replication + daily off-site backups

## 5. Infrastructure and Deployment Architecture

### 5.1 Deployment Environment

**Recommended cloud provider: Alibaba Cloud**

**Rationale:**

- **Local advantage**: fast domestic access, good regulatory compliance
- **Complete portfolio**: full cloud-native product suite
- **AI services**: rich AI and machine-learning offerings
- **Cost**: cheaper than AWS for domestic workloads
- **Support**: Chinese-language technical support with fast response

**Infrastructure layout:**

```
Production architecture:
┌─────────────────────────────────────────────────────────────┐
│                    CDN (Alibaba Cloud CDN)                  │
├─────────────────────────────────────────────────────────────┤
│                    Load Balancer (SLB)                      │
├─────────────────────────────────────────────────────────────┤
│  API gateway cluster │  Web service cluster │  AI cluster   │
│     (Kong/Nginx)     │    (Spring Boot)     │(Python/FastAPI)│
├─────────────────────────────────────────────────────────────┤
│                 Kubernetes cluster (ACK)                    │
├─────────────────────────────────────────────────────────────┤
│ PostgreSQL │  MongoDB     │  Redis  │  Elasticsearch        │
│   (RDS)    │ (self-hosted)│ (Tair)  │  (self-hosted)        │
├─────────────────────────────────────────────────────────────┤
│          Object storage (OSS) + file storage (NAS)          │
└─────────────────────────────────────────────────────────────┘
```

**Environments:**

1. **Development (DEV)**
   - **Scale**: single node, shared resources
   - **Purpose**: day-to-day development and testing
   - **Sizing**: 3 × ECS, 2 vCPU / 4 GB
2. **Test (TEST)**
   - **Scale**: small cluster
   - **Purpose**: functional and integration testing
   - **Sizing**: 5 × ECS, 4 vCPU / 8 GB
3. **Staging (STAGING)**
   - **Scale**: scaled-down copy of production
   - **Purpose**: performance testing, user acceptance testing
   - **Sizing**: 8 × ECS, 8 vCPU / 16 GB
4. **Production (PROD)**
   - **Scale**: highly available cluster
   - **Purpose**: live service
   - **Sizing**: 20 × ECS, 16 vCPU / 32 GB

### 5.2 Containerization and Orchestration

**Docker strategy:**

**Standardized base image:**

```dockerfile
# Java base image (the openjdk repo stopped publishing JRE images at 17;
# use Eclipse Temurin's JRE image instead)
FROM eclipse-temurin:17-jre

# Install required tools
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    telnet \
    && rm -rf /var/lib/apt/lists/*

# Create an application user
RUN groupadd -r appuser && useradd -r -g appuser appuser

# Working directory
WORKDIR /app

# Copy the application
COPY target/*.jar app.jar

# Set permissions
RUN chown -R appuser:appuser /app
USER appuser

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=60s \
  CMD curl -f http://localhost:8080/actuator/health || exit 1

# Start the application
ENTRYPOINT ["java", "-jar", "app.jar"]
```

**Kubernetes deployment:**

```yaml
# User service deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
  namespace: interview-platform
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: registry.cn-hangzhou.aliyuncs.com/interview/user-service:v1.0.0
          ports:
            - containerPort: 8080
          env:
            - name: SPRING_PROFILES_ACTIVE
              value: "prod"
            - name: DB_HOST
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: host
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: user-service
  namespace: interview-platform
spec:
  selector:
    app: user-service
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP
```

**Helm chart management:**

```yaml
# Chart.yaml
apiVersion: v2
name: interview-platform
description: AI Interview Platform Helm Chart
type: application
version: 1.0.0
appVersion: "1.0.0"

# values.yaml
global:
  registry: registry.cn-hangzhou.aliyuncs.com/interview
  namespace: interview-platform

services:
  userService:
    enabled: true
    replicas: 3
    image:
      tag: v1.0.0
    resources:
      requests:
        memory: 512Mi
        cpu: 250m
      limits:
        memory: 1Gi
        cpu: 500m
  resumeService:
    enabled: true
    replicas: 2
    image:
      tag: v1.0.0

database:
  postgresql:
    host: pgm-xxxxxxxx.pg.rds.aliyuncs.com   # PostgreSQL RDS endpoint
    port: 5432
    database: interview_platform
  redis:
    host: r-xxxxxxxx.redis.rds.aliyuncs.com
    port: 6379
```

### 5.3 CI/CD Pipeline

**GitLab CI/CD pipeline:**

```yaml
# .gitlab-ci.yml
stages:
  - test
  - build
  - security-scan
  - deploy-dev
  - deploy-test
  - deploy-staging
  - deploy-prod

variables:
  MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"
  DOCKER_REGISTRY: "registry.cn-hangzhou.aliyuncs.com/interview"

cache:
  paths:
    - .m2/repository/
    - node_modules/

# Unit tests
unit-test:
  stage: test
  image: openjdk:17
  script:
    - ./mvnw clean test
    - ./mvnw jacoco:report
  artifacts:
    reports:
      junit:
        - target/surefire-reports/TEST-*.xml
      coverage_report:
        coverage_format: jacoco
        path: target/site/jacoco/jacoco.xml
  coverage: '/Total.*?([0-9]{1,3})%/'

# Code quality
code-quality:
  stage: test
  image: sonarsource/sonar-scanner-cli:latest
  script:
    - sonar-scanner -Dsonar.projectKey=$CI_PROJECT_NAME -Dsonar.sources=src/main -Dsonar.tests=src/test -Dsonar.java.binaries=target/classes -Dsonar.coverage.jacoco.xmlReportPaths=target/site/jacoco/jacoco.xml
  only:
    - main
    - develop

# Build the Docker image
build-image:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  before_script:
    - docker login -u $DOCKER_USERNAME -p $DOCKER_PASSWORD $DOCKER_REGISTRY
  script:
    - ./mvnw clean package -DskipTests
    - docker build -t $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA .
    - docker push $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA
    - docker tag $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA $DOCKER_REGISTRY/$CI_PROJECT_NAME:latest
    - docker push $DOCKER_REGISTRY/$CI_PROJECT_NAME:latest

# Security scan
security-scan:
  stage: security-scan
  image: aquasec/trivy:latest
  script:
    - trivy image --exit-code 1 --severity HIGH,CRITICAL $DOCKER_REGISTRY/$CI_PROJECT_NAME:$CI_COMMIT_SHA
  allow_failure: true

# Deploy to development
deploy-dev:
  stage: deploy-dev
  image: bitnami/kubectl:latest
  script:
    - kubectl config use-context dev-cluster
    - helm upgrade --install $CI_PROJECT_NAME-dev ./helm-chart --namespace interview-platform-dev --set image.tag=$CI_COMMIT_SHA --set environment=dev
  environment:
    name: development
    url: https://dev.interview-platform.com
  only:
    - develop

# Deploy to production
deploy-prod:
  stage: deploy-prod
  image: bitnami/kubectl:latest
  script:
    - kubectl config use-context prod-cluster
    - helm upgrade --install $CI_PROJECT_NAME ./helm-chart --namespace interview-platform --set image.tag=$CI_COMMIT_SHA --set environment=prod
  environment:
    name: production
    url: https://www.interview-platform.com
  when: manual
  only:
    - main
```

**Deployment strategies:**

1. **Blue-green deployment**

```yaml
# Blue-green rollout (Argo Rollouts)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: user-service
spec:
  replicas: 5
  strategy:
    blueGreen:
      activeService: user-service-active
      previewService: user-service-preview
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 30
      prePromotionAnalysis:
        templates:
          - templateName: success-rate
        args:
          - name: service-name
            value: user-service-preview
      postPromotionAnalysis:
        templates:
          - templateName: success-rate
        args:
          - name: service-name
            value: user-service-active
```
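The `success-rate` analysis referenced by the blue-green rollout above amounts to comparing an observed error rate against a threshold before promotion. A hedged sketch of that gate follows; the 5% threshold is illustrative, and in practice Argo Rollouts evaluates this from Prometheus queries rather than raw counters:

```python
def promotion_decision(total_requests, error_requests, max_error_rate=0.05):
    """Decide whether a preview/canary version may be promoted.

    Mirrors the success-rate analysis template: block promotion when
    the observed error rate exceeds the threshold, and pause when there
    is no traffic to judge by.
    """
    if total_requests == 0:
        return "pause"  # not enough data to promote safely
    error_rate = error_requests / total_requests
    return "promote" if error_rate <= max_error_rate else "rollback"

print(promotion_decision(1000, 12))  # 1.2% error rate -> promote
print(promotion_decision(1000, 80))  # 8% error rate -> rollback
```

The same gate applies to each pause step of a canary rollout: traffic only advances to the next weight when the decision is "promote".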
2. **Canary release**

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: resume-service
spec:
  replicas: 10
  strategy:
    canary:
      steps:
        - setWeight: 10
        - pause: {duration: 5m}
        - setWeight: 30
        - pause: {duration: 10m}
        - setWeight: 50
        - pause: {duration: 15m}
        - setWeight: 100
      canaryService: resume-service-canary
      stableService: resume-service-stable
```

### 5.4 Monitoring and Alerting

**Monitoring architecture:**

```
┌─────────────────────────────────────────────────────────────┐
│                    Grafana Dashboards                       │
├─────────────────────────────────────────────────────────────┤
│               Prometheus + AlertManager                     │
├─────────────────────────────────────────────────────────────┤
│ Node Exporter │ App Metrics │ DB Exporter │ Custom Metrics  │
├─────────────────────────────────────────────────────────────┤
│              ELK Stack (log aggregation/analysis)           │
├─────────────────────────────────────────────────────────────┤
│              Jaeger (distributed tracing)                   │
└─────────────────────────────────────────────────────────────┘
```

**Prometheus configuration:**

```yaml
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "alert_rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

scrape_configs:
  # Kubernetes API server
  - job_name: 'kubernetes-apiservers'
    kubernetes_sd_configs:
      - role: endpoints
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https

  # Application services
  - job_name: 'interview-services'
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names:
            - interview-platform
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
```

**Alert rules:**

```yaml
# alert_rules.yml
groups:
  - name: interview-platform-alerts
    rules:
      # Service availability
      - alert: ServiceDown
        expr: up{job="interview-services"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Service {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} has been down for more than 1 minute."

      # High error rate
      - alert: HighErrorRate
        expr: |
          (
            rate(http_requests_total{status=~"5.."}[5m])
            /
            rate(http_requests_total[5m])
          ) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High error rate on {{ $labels.instance }}"
          description: "Error rate is {{ $value | humanizePercentage }} for {{ $labels.instance }}"

      # High latency
      - alert: HighLatency
        expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High latency on {{ $labels.instance }}"
          description: "95th percentile latency is {{ $value }}s for {{ $labels.instance }}"

      # Database connections
      - alert: DatabaseConnectionHigh
        expr: pg_stat_activity_count > 80
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High database connections"
          description: "Database has {{ $value }} active connections"

      # Memory usage
      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes > 0.85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"
          description: "Memory usage is {{ $value | humanizePercentage }}"
```

**Grafana dashboard:**

```json
{
  "dashboard": {
    "title": "Interview Platform Overview",
    "panels": [
      { "title": "Service Health", "type": "stat",
        "targets": [
          { "expr": "up{job=\"interview-services\"}",
            "legendFormat": "{{ instance }}" }
        ] },
      { "title": "Request Rate", "type": "graph",
        "targets": [
          { "expr": "rate(http_requests_total[5m])",
            "legendFormat": "{{ service }}" }
        ] },
      { "title": "Response Time", "type": "graph",
        "targets": [
          { "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))",
            "legendFormat": "{{ service }}" }
}}" } ] }, { "title": "Error Rate", "type": "graph", "targets": [ { "expr": "rate(http_requests_total{status=~\"5..\"}[5m]) / rate(http_requests_total[5m])", "legendFormat": "{{ service }}" } ] } ] } } ``` **日志聚合(ELK Stack):** ```yaml # logstash配置 input { beats { port => 5044 } } filter { if [fields][service] { mutate { add_field => { "service_name" => "%{[fields][service]}" } } } # 解析Java应用日志 if [service_name] =~ /.*-service/ { grok { match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} \[%{DATA:thread}\] %{DATA:logger} - %{GREEDYDATA:log_message}" } } date { match => [ "timestamp", "yyyy-MM-dd HH:mm:ss.SSS" ] } } } output { elasticsearch { hosts => ["elasticsearch:9200"] index => "interview-platform-%{+YYYY.MM.dd}" } } ``` ## 6. 安全架构 ### 6.1 数据安全 **传输中加密(TLS):** ```nginx # Nginx TLS配置 server { listen 443 ssl http2; server_name api.interview-platform.com; # TLS证书配置 ssl_certificate /etc/ssl/certs/interview-platform.crt; ssl_certificate_key /etc/ssl/private/interview-platform.key; # TLS安全配置 ssl_protocols TLSv1.2 TLSv1.3; ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384; ssl_prefer_server_ciphers off; ssl_session_cache shared:SSL:10m; ssl_session_timeout 10m; # HSTS add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always; # 其他安全头 add_header X-Frame-Options DENY; add_header X-Content-Type-Options nosniff; add_header X-XSS-Protection "1; mode=block"; add_header Referrer-Policy "strict-origin-when-cross-origin"; location / { proxy_pass http://backend-cluster; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; } } ``` **静态数据加密:** ```java // 数据库字段加密 @Entity public class User { @Id private Long id; @Column(name = "email") private String email; // 敏感字段加密存储 @Convert(converter = EncryptedStringConverter.class) @Column(name = "phone") private String phone; 
    @Convert(converter = EncryptedStringConverter.class)
    @Column(name = "id_card")
    private String idCard;
}

// Encrypting attribute converter
@Component
public class EncryptedStringConverter implements AttributeConverter<String, String> {

    @Autowired
    private EncryptionService encryptionService;

    @Override
    public String convertToDatabaseColumn(String attribute) {
        return encryptionService.encrypt(attribute);
    }

    @Override
    public String convertToEntityAttribute(String dbData) {
        return encryptionService.decrypt(dbData);
    }
}
```

**Encrypted file storage:**

```java
// Encrypted resume storage
@Service
public class ResumeStorageService {

    public String uploadResume(MultipartFile file, String candidateId) {
        try {
            // Generate a unique, safe filename
            String fileName = generateSecureFileName(file.getOriginalFilename());

            // Encrypt the file contents
            byte[] encryptedContent = encryptionService.encryptFile(file.getBytes());

            // Upload to OSS
            String objectKey = String.format("resumes/%s/%s", candidateId, fileName);
            ossClient.putObject(bucketName, objectKey,
                new ByteArrayInputStream(encryptedContent));

            // Record file metadata
            ResumeFile resumeFile = new ResumeFile();
            resumeFile.setCandidateId(candidateId);
            resumeFile.setFileName(fileName);
            resumeFile.setObjectKey(objectKey);
            resumeFile.setEncrypted(true);
            resumeFileRepository.save(resumeFile);

            return objectKey;
        } catch (Exception e) {
            throw new StorageException("Failed to upload resume", e);
        }
    }
}
```

### 6.2 Threat Protection

**API security:**

```java
// SQL injection protection
@Repository
public class UserRepository {

    // Parameterized queries
    @Query("SELECT u FROM User u WHERE u.email = :email AND u.status = :status")
    Optional<User> findByEmailAndStatus(@Param("email") String email,
                                        @Param("status") String status);

    // Avoid dynamic SQL string concatenation
    public List<User> searchUsers(UserSearchCriteria criteria) {
        CriteriaBuilder cb = entityManager.getCriteriaBuilder();
        CriteriaQuery<User> query = cb.createQuery(User.class);
        Root<User> root = query.from(User.class);

        List<Predicate> predicates = new ArrayList<>();
        if (StringUtils.hasText(criteria.getName())) {
            predicates.add(cb.like(root.get("name"), "%" + criteria.getName() + "%"));
        }
        if (StringUtils.hasText(criteria.getEmail())) {
{ predicates.add(cb.equal(root.get("email"), criteria.getEmail())); } query.where(predicates.toArray(new Predicate[0])); return entityManager.createQuery(query).getResultList(); } } // XSS防护 @Component public class XssFilter implements Filter { @Override public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException { XssHttpServletRequestWrapper wrappedRequest = new XssHttpServletRequestWrapper( (HttpServletRequest) request); chain.doFilter(wrappedRequest, response); } } public class XssHttpServletRequestWrapper extends HttpServletRequestWrapper { public XssHttpServletRequestWrapper(HttpServletRequest request) { super(request); } @Override public String getParameter(String parameter) { String value = super.getParameter(parameter); return cleanXSS(value); } private String cleanXSS(String value) { if (value == null) { return null; } // 移除潜在的XSS攻击代码 value = value.replaceAll("]*>.*?", ""); value = value.replaceAll("javascript:", ""); value = value.replaceAll("onload\\s*=", ""); value = value.replaceAll("onclick\\s*=", ""); return value; } } // CSRF防护 @Configuration @EnableWebSecurity public class SecurityConfig { @Bean public SecurityFilterChain filterChain(HttpSecurity http) throws Exception { http .csrf(csrf -> csrf .csrfTokenRepository(CookieCsrfTokenRepository.withHttpOnlyFalse()) .ignoringRequestMatchers("/api/public/**") ) .sessionManagement(session -> session .sessionCreationPolicy(SessionCreationPolicy.STATELESS) ); return http.build(); } } ``` **API限流防护:** ```java // Redis实现的令牌桶限流 @Component public class RateLimitService { @Autowired private RedisTemplate redisTemplate; public boolean isAllowed(String key, int limit, int windowSeconds) { String script = "local key = KEYS[1]\n" + "local limit = tonumber(ARGV[1])\n" + "local window = tonumber(ARGV[2])\n" + "local current = redis.call('GET', key)\n" + "if current == false then\n" + " redis.call('SET', key, 1)\n" + " redis.call('EXPIRE', key, 
window)\n" + " return 1\n" + "else\n" + " if tonumber(current) < limit then\n" + " return redis.call('INCR', key)\n" + " else\n" + " return 0\n" + " end\n" + "end"; DefaultRedisScript redisScript = new DefaultRedisScript<>(); redisScript.setScriptText(script); redisScript.setResultType(Long.class); Long result = redisTemplate.execute(redisScript, Collections.singletonList(key), String.valueOf(limit), String.valueOf(windowSeconds)); return result != null && result > 0; } } // 限流注解 @Target(ElementType.METHOD) @Retention(RetentionPolicy.RUNTIME) public @interface RateLimit { int value() default 100; // 每分钟请求次数 int window() default 60; // 时间窗口(秒) String key() default ""; // 限流key } // 限流切面 @Aspect @Component public class RateLimitAspect { @Autowired private RateLimitService rateLimitService; @Around("@annotation(rateLimit)") public Object around(ProceedingJoinPoint point, RateLimit rateLimit) throws Throwable { HttpServletRequest request = ((ServletRequestAttributes) RequestContextHolder .currentRequestAttributes()).getRequest(); String key = generateKey(request, rateLimit.key()); if (!rateLimitService.isAllowed(key, rateLimit.value(), rateLimit.window())) { throw new RateLimitExceededException("Rate limit exceeded"); } return point.proceed(); } private String generateKey(HttpServletRequest request, String customKey) { if (StringUtils.hasText(customKey)) { return customKey; } String userKey = getUserIdentifier(request); String uri = request.getRequestURI(); return String.format("rate_limit:%s:%s", userKey, uri); } } ``` ### 6.3 合规性考虑 **个人信息保护法合规:** ```java // 数据脱敏服务 @Service public class DataMaskingService { // 手机号脱敏 public String maskPhone(String phone) { if (StringUtils.isEmpty(phone) || phone.length() < 7) { return phone; } return phone.substring(0, 3) + "****" + phone.substring(phone.length() - 4); } // 邮箱脱敏 public String maskEmail(String email) { if (StringUtils.isEmpty(email) || !email.contains("@")) { return email; } String[] parts = email.split("@"); String 
username = parts[0]; if (username.length() <= 2) { return email; } return username.substring(0, 2) + "***@" + parts[1]; } // 身份证脱敏 public String maskIdCard(String idCard) { if (StringUtils.isEmpty(idCard) || idCard.length() < 8) { return idCard; } return idCard.substring(0, 4) + "**********" + idCard.substring(idCard.length() - 4); } } // 数据访问审计 @Entity @Table(name = "data_access_logs") public class DataAccessLog { @Id @GeneratedValue(strategy = GenerationType.IDENTITY) private Long id; @Column(name = "user_id") private Long userId; @Column(name = "resource_type") private String resourceType; // resume, interview, user_profile @Column(name = "resource_id") private String resourceId; @Column(name = "action") private String action; // read, write, delete @Column(name = "ip_address") private String ipAddress; @Column(name = "user_agent") private String userAgent; @Column(name = "access_time") private LocalDateTime accessTime; @Column(name = "purpose") private String purpose; // 访问目的 } // 审计切面 @Aspect @Component public class DataAccessAuditAspect { @Autowired private DataAccessLogRepository auditRepository; @AfterReturning("@annotation(auditDataAccess)") public void auditDataAccess(JoinPoint joinPoint, AuditDataAccess auditDataAccess) { HttpServletRequest request = getCurrentRequest(); Authentication auth = SecurityContextHolder.getContext().getAuthentication(); DataAccessLog log = new DataAccessLog(); log.setUserId(getCurrentUserId(auth)); log.setResourceType(auditDataAccess.resourceType()); log.setAction(auditDataAccess.action()); log.setIpAddress(getClientIpAddress(request)); log.setUserAgent(request.getHeader("User-Agent")); log.setAccessTime(LocalDateTime.now()); log.setPurpose(auditDataAccess.purpose()); auditRepository.save(log); } } ``` **数据保留和删除策略:** ```java // 数据生命周期管理 @Service public class DataLifecycleService { // 用户数据删除(用户注销账户时) @Transactional public void deleteUserData(Long userId) { // 1. 匿名化简历数据(保留用于算法训练) anonymizeUserResumes(userId); // 2. 
        // Delete interview audio and video recordings
        deleteInterviewMediaFiles(userId);

        // 3. Delete personally identifiable information
        deletePersonalIdentifiableInfo(userId);

        // 4. Keep required business data in anonymized form
        anonymizeBusinessData(userId);

        // 5. Record the deletion for audit purposes
        logDataDeletion(userId);
    }

    // Periodic cleanup of expired data
    @Scheduled(cron = "0 0 2 * * ?")   // runs daily at 02:00
    public void cleanupExpiredData() {
        // Delete access logs older than 90 days
        dataAccessLogRepository.deleteByAccessTimeBefore(
                LocalDateTime.now().minusDays(90));

        // Delete temporary files older than 1 year
        cleanupTemporaryFiles(LocalDateTime.now().minusYears(1));

        // Anonymize interview data older than 6 months
        anonymizeOldInterviewData(LocalDateTime.now().minusMonths(6));
    }
}
```

## 7. Technology Stack Summary

### 7.1 Technology Stack at a Glance

| Layer | Choice | Version | Primary Use | Alternatives |
|------|----------|------|----------|----------|
| **Frontend framework** | Vue 3 + TypeScript | 3.4+ | UI development | React 18, Angular 17 |
| **State management** | Pinia | 2.1+ | Frontend state | Vuex, Redux Toolkit |
| **UI component library** | Element Plus | 2.4+ | UI components | Ant Design Vue, Quasar |
| **Build tool** | Vite | 5.0+ | Frontend build | Webpack, Rollup |
| **CSS framework** | Tailwind CSS | 3.4+ | Styling | Bootstrap, Bulma |
| **Backend framework** | Spring Boot | 3.2+ | Business logic | NestJS, Django, FastAPI |
| **Microservices** | Spring Cloud | 2023.0+ | Service governance | Dubbo, gRPC |
| **API gateway** | Kong | 3.4+ | API management | Nginx, Zuul, Envoy |
| **Relational database** | PostgreSQL | 15+ | Structured data | MySQL 8.0, Oracle |
| **Document database** | MongoDB | 7.0+ | Unstructured data | CouchDB, Amazon DynamoDB |
| **Search engine** | Elasticsearch | 8.0+ | Full-text search | Solr, Amazon OpenSearch |
| **Cache** | Redis | 7.0+ | Caching and sessions | Memcached, Hazelcast |
| **Message queue** | Apache Kafka | 3.6+ | Async messaging | RabbitMQ, Apache Pulsar |
| **Containerization** | Docker | 24.0+ | Application packaging | Podman, containerd |
| **Orchestration** | Kubernetes | 1.28+ | Container orchestration | Docker Swarm, Nomad |
| **CI/CD** | GitLab CI | 16.0+ | Continuous integration | Jenkins, GitHub Actions |
| **Monitoring** | Prometheus + Grafana | 2.47+ / 10.0+ | System monitoring | Zabbix, DataDog |
| **Logging** | ELK Stack | 8.0+ | Log aggregation | Fluentd + InfluxDB |
| **Tracing** | Jaeger | 1.50+ | Distributed tracing | Zipkin, SkyWalking |
| **Cloud provider** | Alibaba Cloud | - | Infrastructure | AWS, Tencent Cloud, Huawei Cloud |

### 7.2 Trade-offs and Alternatives

**Analysis of the core technology choices:**

#### Frontend stack

**Vue 3 + TypeScript**
- **Pros**:
  - Gentle learning curve; teams ramp up quickly
  - Composition API enables better logic reuse
  - Excellent TypeScript support and type safety
  - Mature ecosystem with rich plugins
  - Strong performance and small bundle size
- **Cons**:
  - Lower adoption at large enterprises than React
  - Some third-party libraries support React first
  - Mobile development requires an additional solution
- **Alternatives**:
  - **React 18 + TypeScript**: suitable when the team already has React experience; richer ecosystem
  - **Angular 17**: suitable for large enterprise projects; complete built-in feature set

#### Backend stack

**Java + Spring Boot**
- **Pros**:
  - Mature and reliable at enterprise scale
  - Strong performance on a well-optimized JVM
  - Complete microservices ecosystem
  - Large talent pool
  - Good support for security and compliance
- **Cons**:
  - Comparatively lower development velocity
  - Larger memory footprint
  - Slower startup times
- **Alternatives**:
  - **Node.js + NestJS**: suits frontend-heavy teams; high development velocity
  - **Python + FastAPI**: suits AI/algorithm integration; rapid development
  - **Go + Gin**: suits high-concurrency scenarios; excellent performance

#### Database choice

**PostgreSQL + MongoDB hybrid**
- **Pros**:
  - PostgreSQL: strong ACID guarantees and good complex-query support
  - MongoDB: flexible schema; easy horizontal scaling
  - Each plays to its strengths; the two are complementary
- **Cons**:
  - Higher operational complexity
  - Cross-store data consistency is harder to manage
  - The team must master multiple technologies
- **Alternatives**:
  - **PostgreSQL only**: simpler architecture; solid JSON support
  - **MySQL + Redis**: conventional stack; mature ecosystem
  - **Cloud-native databases**: e.g. Alibaba Cloud PolarDB; low operational burden

#### Deployment architecture

**Kubernetes + Docker**
- **Pros**:
  - Cloud-native standard; highly portable
  - Autoscaling and high availability
  - Rich ecosystem and tooling
  - A natural fit for microservices
- **Cons**:
  - Steep learning curve
  - High operational complexity
  - Significant resource overhead
- **Alternatives**:
  - **Serverless**: e.g. Alibaba Cloud Function Compute; minimal operations
  - **Traditional VMs**: mature technology the team already knows
  - **Managed container services**: e.g. Alibaba Cloud Container Service; reduced operational complexity

**Selection guidance:**

1. **Startups**: Vue + Node.js + MongoDB + Serverless, for fast iteration
2. **Mid-size companies**: Vue + Java + PostgreSQL + Kubernetes, balancing performance and development velocity
3. **Large enterprises**: React + Java + hybrid databases + full microservices, prioritizing stability and scalability

**Phased implementation strategy:**

**Phase 1 (MVP):**
- Frontend: Vue 3 + Element Plus
- Backend: Spring Boot monolith
- Database: PostgreSQL
- Deployment: traditional cloud servers

**Phase 2 (Scaling):**
- Introduce Redis caching
- Add MongoDB for unstructured data
- Containerize deployments
- Introduce CI/CD

**Phase 3 (Microservices):**
- Split out microservices
- Introduce Kubernetes
- Build out monitoring and logging
- Introduce a message queue

**Phase 4 (Optimization):**
- Performance tuning
- Security hardening
- Multi-cloud deployment
- Enhanced AI capabilities

---

## Summary

This architecture provides a complete, scalable, and secure technical solution for the AI interview platform. Built on microservices, cloud-native technology, and a modern DevOps toolchain, it can support growth from an initial 1,000 client companies to millions of users.

Key design principles:
1. **Scalability first**: supports both horizontal and vertical scaling
2. **Security by design**: end-to-end protection and compliance support
3. **High availability**: multi-layer fault tolerance and recovery
4. **Developer productivity**: a modern toolchain and automated workflows
5. **Cost control**: sensible resource allocation and optimization

The design not only meets current business needs but also leaves ample room for future technical evolution and business expansion.
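As a closing illustration, the Redis Lua script in section 6.2 implements a fixed-window counter: the first request in a window creates the key with count 1 and an expiry, later requests increment the count until the limit is hit. The sketch below reproduces that same check-and-increment logic in plain, self-contained Java, with an in-memory map standing in for Redis; the class and method names are illustrative, not part of the platform code.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory sketch of the fixed-window counter that the Redis Lua script
// in section 6.2 implements. A Map plus an expiry timestamp stands in for
// Redis GET / SET / EXPIRE / INCR; all names here are illustrative.
public class FixedWindowLimiter {

    private static class Window {
        long expiresAtMillis;
        int count;
    }

    private final Map<String, Window> windows = new HashMap<>();
    private final int limit;
    private final long windowMillis;

    public FixedWindowLimiter(int limit, long windowSeconds) {
        this.limit = limit;
        this.windowMillis = windowSeconds * 1000;
    }

    // Mirrors the Lua script: create the key with count 1 on first sight,
    // increment while under the limit, reject once the limit is reached.
    public synchronized boolean isAllowed(String key, long nowMillis) {
        Window w = windows.get(key);
        if (w == null || nowMillis >= w.expiresAtMillis) {  // GET returned nil, or key expired
            w = new Window();
            w.expiresAtMillis = nowMillis + windowMillis;   // SET + EXPIRE
            w.count = 1;
            windows.put(key, w);
            return true;
        }
        if (w.count < limit) {                              // INCR
            w.count++;
            return true;
        }
        return false;                                       // over the limit in this window
    }

    public static void main(String[] args) {
        FixedWindowLimiter limiter = new FixedWindowLimiter(3, 60);
        System.out.println(limiter.isAllowed("user:42", 0));        // true
        System.out.println(limiter.isAllowed("user:42", 1));        // true
        System.out.println(limiter.isAllowed("user:42", 2));        // true
        System.out.println(limiter.isAllowed("user:42", 3));        // false, limit hit
        System.out.println(limiter.isAllowed("user:42", 61_000));   // true, new window
    }
}
```

The production design keeps this state in Redis rather than in memory precisely so that all service instances behind the gateway share one counter per key; the Lua script exists to make the read-check-increment sequence atomic, which the `synchronized` keyword approximates here for a single process.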