[25.06.13 / TASK-185] Feature - Implement Velog trending & personalized insight analysis batch (#32)

Jihyun3478 · web-flow · commit 7a76b60c5500 · 2025-07-06T22:04:51.000+09:00
* feat: implement weekly trend & user trend analysis and storage

* refactor: refactor exception handling and logging

* feat: add setup_django.py

* modify: separate prompts and add exception handling

* modify: use multithreading and bulk_create

Use multithreading and bulk_create
- Apply multithreading
- Apply bulk_create
- Apply setup_django
- Use get_local_now()
- Fetch only users that have an email

* hotfix: fix the cause of the test-ci failure

* refactor: address CodeRabbit review

* refactor: change log messages to English

* hotfix: call the Velog API

* hotfix: fix date calculation logic and use update_or_create for user trend analysis

* refactor: remove unused imports

* hotfix: call post lookup (DB) & post detail lookup (Velog API)

* hotfix: fix date calculation logic

* hotfix: fetch trending posts through the Velog API

* refactor: extract weekly batch date calculation logic into a util

* feat: add username and post thumbnail to the weekly trend analysis batch

* refactor: extract prompt constants and add result logging

* hotfix: fix UserWeeklyTrend analysis logic

* feat: add retry logic to weekly trend analysis

* modify: revise prompt content and remove retry logic

* refactor: refactor code to satisfy linting

* hotfix: use date instead of datetime.date

* hotfix: add username, thumbnailUrl, slug to LLM output

* refactor: add comments

* hotfix: add username, thumbnailUrl, slug to WeeklyTrend LLM output

* hotfix: add username, thumbnail, slug to conftest.py so test-ci passes

* hotfix: address review

Address review
- Fix view count and like count calculation for PostDailyStatistics
- User post analysis -> process each post separately
- Fix field name mismatches
- Exclude title from LLM output

* hotfix: address CodeRabbit review

* refactor: address CodeRabbit review

* hotfix: fix LLM analysis logic
diff --git a/backoffice/settings/base.py b/backoffice/settings/base.py
@@ -24,6 +24,8 @@
 
 environ.Env.read_env(os.path.join(BASE_DIR, ".env"))
 
+OPENAI_API_KEY = env("OPENAI_API_KEY")
+
 SENTRY_DSN = env("SENTRY_DSN", default="")
 SENTRY_ENVIRONMENT = env("SENTRY_ENVIRONMENT", default="prod")
 SENTRY_TRACES_SAMPLE_RATE = env.float("SENTRY_TRACES_SAMPLE_RATE", default=1.0)
diff --git a/insight/models.py b/insight/models.py
@@ -11,6 +11,12 @@ class TrendingItem(SerializableMixin):
     title: str
     summary: str
     key_points: list[str]
+    username: str
+    thumbnail: str
+    slug: str
+
+    def get_post_url(self) -> str:
+        return f"https://velog.io/@{self.username}/{self.slug}"
 
 
 @dataclass
diff --git a/insight/tasks/prompts.py b/insight/tasks/prompts.py
@@ -0,0 +1,98 @@
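+SYS_PROM = (
+    "You are the world's best trend analyst with 50 years of experience. You need to write a weekly newsletter based on tech blog post data.\n"
+    "Use only the data I provide to identify and summarize the trends in it. Search externally for related material if needed."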
+)
+
+WEEKLY_TREND_PROM = """
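+<Goal>
+- Analyze trends in the blog post data
+- Cover "overall popular posts, tech keywords, title trends, and a summary and trend of each post's detailed content"
+
+<Writing order>
+1. 🔥 Weekly trending post summaries
+    - Summarize the core content of every trending post provided below
+    - Use about 3-4 sentences covering the core technology, the intended message, and a summary of the content
+    - Do not merely shorten the text. Summarize the key points
+
+2. ✨ Weekly trend analysis
+    - Extract hot tech keywords
+    - Analyze title trends and content trends
+    - Other insight comments
+
+<Rules>
+- Mix in some emotion and a casual tone. Not too stiff.
+- If it is not in the JSON, say nothing about it. No lying.
+- There will be a big reward if you do well.
+- Approach and solve it step by step.
+- Analyze every trending post; do not leave any out.
+- The response must follow this JSON structure
+```json
+{{
+    "trending_summary": [
+        {{
+            "title": "Post title",
+            "summary": "Summary of at least 3 sentences, no exceptions",
+            "key_points": ["Key point 1", "Key point 2", "..."]
+        }},
+        // Summaries of other trending posts...
+    ],
+    "trend_analysis": {{
+        "hot_keywords": ["keyword1", "keyword2", "..."],
+        "title_trends": "Title trend analysis",
+        "content_trends": "Content trend analysis",
+        "insights": "Additional insights and comments"
+    }}
+}}
+```
+
+<Trending blog post list>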
+{posts}
+"""
+
+USER_TREND_PROM = """
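+<Goal>
+- Analyze one user's weekly post trends based on their blog activity
+- Cover "tech keywords, title trends, and a summary and trend of each post's detailed content"
+- Provide feedback that helps the user grow
+
+<Writing order>
+1. 🔥 Weekly user post summaries
+    - Summarize the core content of the user posts provided below
+    - Use about 3-4 sentences covering the core technology, the intended message, and a summary of the content
+    - Do not merely shorten the text. Summarize the key points
+
+2. ✨ Weekly user trend analysis
+    - Extract the tech keywords that appear in the user's posts
+    - Analyze the flow of titles / changes in topic
+    - Clearly capture the user's intent, the technologies used, and the problems solved
+    - Include insights, suggestions, and encouragement that will help the user
+
+<Rules>
+- Mix in some emotion and a casual tone. Not too stiff. Be sincere.
+- If it is not in the JSON, say nothing about it. No lying.
+- There will be a big reward if you do well.
+- Approach and solve it step by step.
+- Analyze every user post; do not leave any out.
+- The response must follow this JSON structure
+```json
+{{
+    "trending_summary": [
+        {{
+            "title": "Post title",
+            "summary": "Summary of at least 3 sentences, no exceptions",
+            "key_points": ["Key point 1", "Key point 2", "..."]
+        }},
+        // Summaries of other trending posts...
+    ],
+    "trend_analysis": {{
+        "hot_keywords": ["keyword1", "keyword2", "..."],
+        "title_trends": "Title trend analysis",
+        "content_trends": "Content trend analysis",
+        "insights": "Additional insights and comments on the user's direction going forward"
+    }}
+}}
+```
+
+<User trend post list>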
+{posts}
+"""
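Review note on the prompt templates in `insight/tasks/prompts.py`: the braces in the JSON example are doubled because both prompts are rendered with `str.format`, which would otherwise read a single `{` as the start of a placeholder. The snippet below is a minimal sketch of that mechanism, not the PR's code. `TEMPLATE` and `render_prompt` are hypothetical names, and serializing the posts with `json.dumps` is an assumption; the PR passes the list straight to `.format`, which inserts its Python `repr` instead.

```python
import json

# Hypothetical miniature of a prompt template; not the PR's actual prompt.
TEMPLATE = """
Respond with this JSON structure:
{{
    "trending_summary": [{{"summary": "...", "key_points": ["..."]}}]
}}

<posts>
{posts}
"""


def render_prompt(posts: list[dict]) -> str:
    # str.format turns "{{" and "}}" into literal braces and fills only {posts}.
    # Assumption: posts are serialized as JSON rather than inserted as a Python repr.
    return TEMPLATE.format(posts=json.dumps(posts, ensure_ascii=False))


rendered = render_prompt([{"title": "Hello"}])
```

Forgetting to double a brace in such a template would typically surface as a `KeyError` or `ValueError` from `.format` at render time, so rendering each template once in a test may be a cheap guard.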
diff --git a/insight/tasks/setup_django.py b/insight/tasks/setup_django.py
@@ -6,4 +6,4 @@
 
 import django
 
-django.setup()
+django.setup()
diff --git a/insight/tasks/user_weekly_trend_analysis.py b/insight/tasks/user_weekly_trend_analysis.py
@@ -0,0 +1,191 @@
+"""
+[25.07.01] Weekly user analysis batch (author: 이지현)
+- Run it with the command below
+- poetry run python ./insight/tasks/user_weekly_trend_analysis.py
+"""
+
+import asyncio
+import logging
+
+import aiohttp
+import setup_django  # noqa
+from asgiref.sync import sync_to_async
+from django.conf import settings
+from django.db.models import OuterRef, Subquery
+from weekly_llm_analyzer import analyze_user_posts
+
+from insight.models import UserWeeklyTrend
+from posts.models import Post, PostDailyStatistics
+from scraping.velog.client import VelogClient
+from users.models import User
+from utils.utils import get_previous_week_range
+
+logger = logging.getLogger("scraping")
+
+
+async def run_weekly_user_trend_analysis(user, velog_client, week_start, week_end):
+    """Summarize and analyze each user's posts based on weekly statistics."""
+    user_id = user["id"]
+    try:
+        # 1. Fetch the post list together with the latest statistics
+        latest_stats_subquery = PostDailyStatistics.objects.filter(
+            post=OuterRef("pk")
+        ).order_by("-date")
+
+        posts = await sync_to_async(list)(
+            Post.objects.filter(
+                user_id=user_id, 
+                released_at__range=(week_start, week_end)
+            )
+            .annotate(
+                latest_view_count=Subquery(latest_stats_subquery.values("daily_view_count")[:1]),
+                latest_like_count=Subquery(latest_stats_subquery.values("daily_like_count")[:1]),
+            )
+            .values("id", "title", "post_uuid", "latest_view_count", "latest_like_count")
+        )
+
+        if not posts:
+            logger.info("[user_id=%s] No posts found. Skipping.", user_id)
+            return None
+
+        # 2. Build a simple summary string
+        simple_summary = (
+            f"Total posts: {len(posts)}, "
+            f"Total views: {sum(p['latest_view_count'] or 0 for p in posts)}, "
+            f"Total likes: {sum(p['latest_like_count'] or 0 for p in posts)}"
+        )
+
+        # 3. Fetch post details from Velog
+        full_contents = []
+        post_meta = []
+
+        for p in posts:
+            try:
+                velog_post = await velog_client.get_post(str(p["post_uuid"]))
+                if velog_post and velog_post.body:
+                    full_contents.append(
+                        {
+                            "Title": p["title"],
+                            "Content": velog_post.body,
+                            "Views": p["latest_view_count"] or 0,
+                            "Likes": p["latest_like_count"] or 0,
+                        }
+                    )
+                    post_meta.append(
+                        {
+                            "title": p["title"],
+                            "username": velog_post.user.username if velog_post.user else "",
+                            "thumbnail": velog_post.thumbnail or "",
+                            "slug": velog_post.url_slug or "",
+                        }
+                    )
+            except Exception as err:
+                logger.warning("[user_id=%s] Failed to fetch Velog post : %s", user_id, err)
+                continue
+
+        # 4. LLM analysis
+        detailed_insight = []
+
+        max_len = max(len(full_contents), len(post_meta))
+        for i in range(max_len):
+            post = full_contents[i] if i < len(full_contents) else {}
+            meta = post_meta[i] if i < len(post_meta) else {}
+
+            try:
+                result = analyze_user_posts([post], settings.OPENAI_API_KEY)
+                result_item = (result.get("trending_summary") or [{}])[0]
+                summary = result_item.get("summary", "") or "[summary failed]"
+                key_points = result_item.get("key_points", [])
+            except Exception as err:
+                logger.warning(
+                    "[user_id=%s] LLM analysis failed for post index %d: %s", user_id, i, err
+                )
+                summary = "[summary failed]"
+                key_points = []
+
+            detailed_insight.append(
+                {
+                    "summary": summary,
+                    "key_points": key_points,
+                    "username": meta.get("username", ""),
+                    "thumbnail": meta.get("thumbnail", ""),
+                    "slug": meta.get("slug", ""),
+                }
+            )
+
+        # 5. Build the insight payload to store
+        insight = {
+            "trending_summary": detailed_insight,
+            "trend_analysis": {"summary": simple_summary},
+        }
+
+        return UserWeeklyTrend(
+            user_id=user_id,
+            week_start_date=week_start,
+            week_end_date=week_end,
+            insight=insight,
+        )
+
+    except Exception as e:
+        logger.exception("[user_id=%s] Unexpected error : %s", user_id, e)
+        return None
+
+
+async def run_all_users():
+    logger.info("User weekly trend analysis started")
+    week_start, week_end = get_previous_week_range()
+
+    # 1. Fetch the user list
+    users = await sync_to_async(list)(
+        User.objects.filter(email__isnull=False)
+        .exclude(email="")
+        .values("id", "username", "access_token", "refresh_token")
+    )
+
+    async with aiohttp.ClientSession() as session:
+        # 2. Create the VelogClient singleton
+        velog_client = VelogClient.get_client(
+            session=session,
+            access_token="dummy_access_token",
+            refresh_token="dummy_refresh_token",
+        )
+
+        tasks = []
+        for user in users:
+            try:
+                # 3. Register the analysis task
+                tasks.append(
+                    run_weekly_user_trend_analysis(
+                        user, velog_client, week_start, week_end
+                    )
+                )
+            except Exception as e:
+                logger.warning("[user_id=%s] Failed to prepare Velog client : %s", user["id"], e)
+
+        # 4. Run the tasks concurrently
+        trends = await asyncio.gather(*tasks, return_exceptions=True)
+        results = []
+
+        for i, trend in enumerate(trends):
+            if isinstance(trend, UserWeeklyTrend):
+                results.append(trend)
+            elif isinstance(trend, Exception):
+                logger.warning("Task %d failed with exception: %s", i, trend)
+            else:
+                logger.warning("Task %d returned None (no posts or other issue)", i)
+
+    # 5. Save to the DB
+    for trend in results:
+        try:
+            await sync_to_async(UserWeeklyTrend.objects.update_or_create)(
+                user_id=trend.user_id,
+                week_start_date=trend.week_start_date,
+                week_end_date=trend.week_end_date,
+                defaults={"insight": trend.insight},
+            )
+        except Exception as e:
+            logger.exception("[user_id=%s] Failed to save trend : %s", trend.user_id, e)
+
+
+if __name__ == "__main__":
+    asyncio.run(run_all_users())
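Review note on `run_all_users`: `analyze_user_posts` is a plain synchronous function, so calling it directly inside a coroutine holds the event loop for the duration of each LLM request and `asyncio.gather` gains little concurrency from it. One possible approach, sketched below under assumptions, is to push the blocking call onto a worker thread with `asyncio.to_thread` and to pair each outcome with its user id so that log lines can name the user rather than a task index. `blocking_analysis`, `analyze`, and `run_all` are hypothetical stand-ins, not functions from this PR.

```python
import asyncio
import time


def blocking_analysis(user_id: int):
    # Stand-in for a synchronous LLM call (assumption: it blocks the caller).
    time.sleep(0.01)
    if user_id == 3:
        raise RuntimeError("analysis failed")
    # None models the "no posts this week" case.
    return None if user_id == 2 else {"user_id": user_id}


async def analyze(user_id: int):
    # Run the blocking function in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_analysis, user_id)


async def run_all(user_ids: list[int]):
    # return_exceptions=True hands exceptions back as values, in input order.
    outcomes = await asyncio.gather(
        *(analyze(uid) for uid in user_ids), return_exceptions=True
    )
    saved, skipped, failed = [], [], []
    for uid, outcome in zip(user_ids, outcomes):
        if isinstance(outcome, Exception):
            failed.append(uid)
        elif outcome is None:
            skipped.append(uid)
        else:
            saved.append(outcome)
    return saved, skipped, failed


saved, skipped, failed = asyncio.run(run_all([1, 2, 3]))
```

`asyncio.to_thread` requires Python 3.9 or later. With many users, a semaphore or a bounded thread pool would probably be needed to stay within the LLM provider's rate limits.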
diff --git a/insight/tasks/weekly_llm_analyzer.py b/insight/tasks/weekly_llm_analyzer.py
@@ -0,0 +1,56 @@
+import logging
+import json
+from typing import Any
+
+from prompts import SYS_PROM, USER_TREND_PROM, WEEKLY_TREND_PROM
+
+from modules.llm.base_client import LLMClient
+from modules.llm.openai.client import OpenAIClient
+
+logger = logging.getLogger("scraping")
+
+
+def analyze_trending_posts(posts: list, api_key: str) -> dict[Any, Any]:
+    client: LLMClient = OpenAIClient.get_client(api_key)
+    prompt = WEEKLY_TREND_PROM.format(posts=posts)
+
+    logger.info("Generated weekly trend prompt:\n%s", prompt)
+
+    try:
+        result = client.generate_text(
+            prompt=prompt,
+            system_prompt=SYS_PROM,
+            temperature=0.1,
+            response_format={"type": "json_object"},
+        )
+
+        if isinstance(result, str):
+            result = json.loads(result)
+
+        return result
+    except Exception as e:
+        logger.error("Failed to analyze_trending_posts : %s", e)
+        raise
+
+
+def analyze_user_posts(posts: list, api_key: str) -> dict[Any, Any]:
+    client: LLMClient = OpenAIClient.get_client(api_key)
+    prompt = USER_TREND_PROM.format(posts=posts)
+
+    logger.info("Generated user trend prompt:\n%s", prompt)
+
+    try:
+        result = client.generate_text(
+            prompt=prompt,
+            system_prompt=SYS_PROM,
+            temperature=0.1,
+            response_format={"type": "json_object"},
+        )
+
+        if isinstance(result, str):
+            result = json.loads(result)
+
+        return result
+    except Exception as e:
+        logger.error("Failed to analyze_user_posts : %s", e)
+        raise
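Review note on the analyzer's return value: both functions are annotated as returning a `dict`, and the prompts ask for an object with a `trending_summary` list, so a caller that indexes the result positionally (for example `result[0]`) would raise `KeyError` on a well-formed response. The helper below is a minimal sketch of a defensive lookup under that assumed response shape; `first_summary_item` is a hypothetical name, not part of this PR.

```python
from typing import Any


def first_summary_item(result: Any) -> dict:
    """Return the first entry of result["trending_summary"], or {} if it is missing."""
    if not isinstance(result, dict):
        return {}
    items = result.get("trending_summary")
    if isinstance(items, list) and items and isinstance(items[0], dict):
        return items[0]
    return {}


# Sample shaped like the JSON structure requested in the prompts.
sample = {
    "trending_summary": [{"summary": "s", "key_points": ["a", "b"]}],
    "trend_analysis": {"hot_keywords": ["k"]},
}
item = first_summary_item(sample)
```

Returning an empty dict for malformed responses lets the caller fall back to a placeholder summary without a broad `except` hiding the real cause.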
diff --git a/insight/tasks/weekly_trend_analysis.py b/insight/tasks/weekly_trend_analysis.py
diff --git a/insight/tests/conftest.py b/insight/tests/conftest.py
diff --git a/utils/utils.py b/utils/utils.py
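The body of the `utils/utils.py` change is not shown here, so the behavior of `get_previous_week_range()` cannot be confirmed from this page. Purely as an illustration of what a "previous week" helper might compute, the sketch below returns Monday 00:00 through Sunday 23:59:59.999999 of the week before a given day. The name `previous_week_range_sketch`, the Monday-to-Sunday convention, and the use of naive datetimes are all assumptions; the real helper may well be timezone-aware.

```python
from datetime import date, datetime, time, timedelta
from typing import Optional, Tuple


def previous_week_range_sketch(today: Optional[date] = None) -> Tuple[datetime, datetime]:
    """Hypothetical helper: bounds of the Monday-to-Sunday week before `today`."""
    today = today or date.today()
    # weekday() is 0 for Monday, so this steps back to the current week's Monday.
    this_monday = today - timedelta(days=today.weekday())
    last_monday = this_monday - timedelta(days=7)
    last_sunday = this_monday - timedelta(days=1)
    return (
        datetime.combine(last_monday, time.min),
        datetime.combine(last_sunday, time.max),
    )


# 2025-07-06 is a Sunday, so the previous week runs from 06-23 to 06-29.
start, end = previous_week_range_sketch(date(2025, 7, 6))
```

Passing the reference day as a parameter keeps the calculation testable without patching the clock.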
