Compare commits
2 commits
99f7959f5a...be5f7c8808

| Author | SHA1 | Commit date | |
|---|---|---|---|
| | be5f7c8808 | | |
| | 5e0c64db9e | | |

156
similarweb-analytics/SKILL.md
Normal file
@@ -0,0 +1,156 @@
---
name: similarweb-analytics
description: Analyze website and domain traffic with SimilarWeb APIs through a Docker sandbox. Use for visits, unique visitors, rank, bounce rate, traffic sources, traffic by country, and domain comparison research.
---

# SimilarWeb Analytics

## Overview

Use this skill to run SimilarWeb analytics in an isolated Docker container and save every API response to JSON immediately.
Use it when the user asks about domain traffic, popularity ranking, engagement quality, channel mix, or country-level traffic split.

## Trigger Cues

Use this skill when the request includes one or more of these cues:
- Domain inputs such as `google.com`, `amazon.com`, `openai.com`
- Traffic words such as `visits`, `unique visitors`, `traffic trend`
- Ranking words such as `global rank`, `website rank`
- Engagement words such as `bounce rate`, `pages per visit`, `visit duration`
- Source words such as `organic`, `paid`, `direct`, `social`, `referrals`
- Geography words such as `top countries`, `country split`, `regional traffic`
- Comparison words such as `compare`, `vs`, `benchmark`

## Workflow

1. Parse user intent into API call inputs:
   - `domain` (required)
   - `api` (required)
   - Optional: `start_date`, `end_date`, `country`, `granularity`, `limit`, `main_domain_only`
2. Build image when needed:
   - Run `scripts/run_in_docker.sh --build -- --self-test`
3. Execute query in Docker sandbox:
   - Run `scripts/run_in_docker.sh -- --api <api> --domain <domain> ...`
4. Persist output on every call:
   - Always pass `--output /data/<file>.json`, or omit it so the entrypoint writes `/data/<api>-<domain>.json` automatically
   - Never keep API output only in terminal output
5. For comparisons:
   - Execute one call per domain with the same time window
   - Save each domain response as a separate JSON file for reproducible analysis
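
The comparison step can be sketched in Python as follows. This is a minimal illustration, not part of the skill: `build_comparison_commands` and the per-domain file naming are hypothetical, and the sketch only assembles argument lists for `scripts/run_in_docker.sh` without running them.

```python
from typing import List, Optional

RUNNER = "scripts/run_in_docker.sh"  # host wrapper named in this skill


def build_comparison_commands(
    api: str,
    domains: List[str],
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> List[List[str]]:
    """Return one argv list per domain, all sharing the same time window."""
    commands = []
    for domain in domains:
        argv = [RUNNER, "--", "--api", api, "--domain", domain]
        if start_date:
            argv += ["--start-date", start_date]
        if end_date:
            argv += ["--end-date", end_date]
        # Hypothetical naming scheme: one JSON file per domain under /data.
        argv += ["--output", f"/data/{api}-{domain}.json"]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    for cmd in build_comparison_commands(
        "visits-total", ["amazon.com", "openai.com"], "2025-12", "2026-02"
    ):
        print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)`. Keeping the window arguments identical across domains is what makes the saved files comparable.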

## Command Entry Points

- Main host wrapper: `scripts/run_in_docker.sh`
- Container entrypoint: `scripts/docker/entrypoint.py`
- Image definition: `scripts/docker/Dockerfile`
- Runtime adapter installer: `scripts/install_runtime_adapter.sh`
- Runtime adapter source: `scripts/runtime/data_api.py`
- Test runner: `scripts/test_docker_workflow.sh`

## Quick Start

Install runtime adapter to expected host path:

```bash
/root/.codex/skills/similarweb-analytics/scripts/install_runtime_adapter.sh
```

Build image and verify runtime:

```bash
/root/.codex/skills/similarweb-analytics/scripts/run_in_docker.sh --build -- --self-test
```

Dry run without consuming API credits:

```bash
/root/.codex/skills/similarweb-analytics/scripts/run_in_docker.sh -- \
  --api visits-total \
  --domain amazon.com \
  --country world \
  --dry-run
```

Real call and save data immediately:

```bash
/root/.codex/skills/similarweb-analytics/scripts/run_in_docker.sh -- \
  --api traffic-by-country \
  --domain amazon.com \
  --start-date 2025-12 \
  --end-date 2026-02 \
  --limit 10 \
  --output /data/amazon-country.json
```

## Supported APIs

- `global-rank` -> `SimilarWeb/get_global_rank`
- `visits-total` -> `SimilarWeb/get_visits_total`
- `unique-visit` -> `SimilarWeb/get_unique_visit`
- `bounce-rate` -> `SimilarWeb/get_bounce_rate`
- `traffic-sources-desktop` -> `SimilarWeb/get_traffic_sources_desktop`
- `traffic-sources-mobile` -> `SimilarWeb/get_traffic_sources_mobile`
- `traffic-by-country` -> `SimilarWeb/get_total_traffic_by_country`

For parameter matrix and defaults, see `references/api-matrix.md`.

## Sandbox Rules

`scripts/run_in_docker.sh` runs with:
- Non-root container user
- Read-only root filesystem
- `tmpfs` only for `/tmp` and `/var/tmp`
- Dropped Linux capabilities (`--cap-drop ALL`)
- `no-new-privileges` enabled
- CPU, memory, and PID limits

Runtime dependency mount:
- Must mount host runtime path into container at `/opt/.manus/.sandbox-runtime`
- Default host path is `/opt/.manus/.sandbox-runtime`
- You can override with `--runtime-dir <path>`
- `data_api.py` must exist in that runtime directory

Credential pass-through:
- `SIMILARWEB_API_KEY` for official Similarweb API mode (`SIMILARWEB_BASE_URL` optionally overrides the API base URL)
- Optional fallback: `RAPIDAPI_KEY` and `RAPIDAPI_SIMILARWEB_HOST`
- Runner auto-forwards these env vars into container when present
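
As a rough sketch, the rules above correspond to a `docker run` argument list like the one below. The flag values are copied from `scripts/run_in_docker.sh`; the function name `sandbox_docker_args` is hypothetical and exists only for illustration. The non-root user itself comes from the image (`USER app` in the Dockerfile), not from a flag.

```python
from typing import List


def sandbox_docker_args(
    image: str,
    runtime_dir: str,
    output_dir: str,
    entrypoint_args: List[str],
    network: str = "bridge",
) -> List[str]:
    """Assemble the hardened `docker run` argv described in Sandbox Rules."""
    return [
        "docker", "run", "--rm",
        "--network", network,
        "--read-only",                                  # read-only root filesystem
        "--tmpfs", "/tmp:rw,noexec,nosuid,size=64m",    # writable scratch only
        "--tmpfs", "/var/tmp:rw,noexec,nosuid,size=32m",
        "--cap-drop", "ALL",                            # no Linux capabilities
        "--security-opt", "no-new-privileges",
        "--pids-limit", "256",
        "--memory", "512m",
        "--cpus", "1.0",
        # `-e NAME` without a value forwards the host variable only if it is set.
        "-e", "SIMILARWEB_API_KEY",
        "-e", "RAPIDAPI_KEY",
        "-e", "RAPIDAPI_SIMILARWEB_HOST",
        "-v", f"{runtime_dir}:/opt/.manus/.sandbox-runtime:ro",
        "-v", f"{output_dir}:/data:rw",
        image,
        *entrypoint_args,
    ]


if __name__ == "__main__":
    print(" ".join(sandbox_docker_args(
        "codex/similarweb-analytics:latest",
        "/opt/.manus/.sandbox-runtime",
        "./similarweb-output",
        ["--self-test"],
    )))
```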

## Data Constraints

- Historical data window is at most 12 months
- `traffic-by-country` is limited to at most 3 months
- Latest reliable month is the last complete month
- Default date range:
  - 6 months: `global-rank`, `visits-total`, `unique-visit`, `bounce-rate`
  - 3 months: `traffic-sources-desktop`, `traffic-sources-mobile`, `traffic-by-country`
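
A worked example of the default window, as a minimal sketch that mirrors the month arithmetic in `scripts/docker/entrypoint.py`. The helper names `shift` and `default_window` are illustrative, not the entrypoint's own.

```python
from datetime import date
from typing import Tuple

DEFAULT_MONTHS = {
    "global-rank": 6,
    "visits-total": 6,
    "unique-visit": 6,
    "bounce-rate": 6,
    "traffic-sources-desktop": 3,
    "traffic-sources-mobile": 3,
    "traffic-by-country": 3,
}


def shift(year: int, month: int, delta: int) -> Tuple[int, int]:
    """Move a (year, month) pair by `delta` months."""
    zero_based = year * 12 + (month - 1) + delta
    return zero_based // 12, zero_based % 12 + 1


def default_window(api: str, today: date) -> Tuple[str, str]:
    """Return (start, end) as YYYY-MM, ending at the last complete month."""
    end = shift(today.year, today.month, -1)
    start = shift(end[0], end[1], -(DEFAULT_MONTHS[api] - 1))
    return f"{start[0]:04d}-{start[1]:02d}", f"{end[0]:04d}-{end[1]:02d}"


if __name__ == "__main__":
    print(default_window("visits-total", date(2026, 3, 5)))        # ('2025-09', '2026-02')
    print(default_window("traffic-by-country", date(2026, 3, 5)))  # ('2025-12', '2026-02')
```

On `2026-03-05` the last complete month is `2026-02`, so a 6-month default covers `2025-09` through `2026-02` inclusive.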

## Validation Record

Last validated on `2026-03-05`:
- Docker image build succeeded
- Container self-test succeeded
- End-to-end fixture call succeeded and wrote JSON output
- Skill structure validation succeeded with `quick_validate.py`
- Runtime adapter installed to `/opt/.manus/.sandbox-runtime/data_api.py` and imported successfully
- Official mode live call attempted and failed fast with explicit credential error when `SIMILARWEB_API_KEY` is unset
- Live network call attempted via RapidAPI fallback; request reached provider and returned `403 not subscribed` (credential/subscription issue, not runtime failure)

## Troubleshooting

- Error `data_api import failed`:
  - Check that runtime path exists on host and is mounted to `/opt/.manus/.sandbox-runtime`
- Error about date range:
  - Use `YYYY-MM` format and keep range inside API limits
- No output file:
  - Ensure output points to `/data/...` inside container or mounted output directory from host
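
For the "No output file" case, it can help to compute where a container path should appear on the host. The sketch below assumes the runner's default mount of `./similarweb-output` at `/data`; `host_path_for` is a hypothetical helper, not part of the skill.

```python
from pathlib import PurePosixPath

CONTAINER_DATA_DIR = PurePosixPath("/data")  # mount point used by run_in_docker.sh


def host_path_for(container_output: str, host_output_dir: str = "./similarweb-output") -> str:
    """Translate a container `--output` path into the matching host path."""
    path = PurePosixPath(container_output)
    try:
        relative = path.relative_to(CONTAINER_DATA_DIR)
    except ValueError:
        # Paths outside /data are not on the mounted volume.
        raise ValueError(
            f"{container_output!r} is outside /data and will not reach the host"
        ) from None
    return str(PurePosixPath(host_output_dir) / relative)


if __name__ == "__main__":
    print(host_path_for("/data/amazon-country.json", "/tmp/out"))
```

If the computed host path does not exist after a run, check the `--output-dir` passed to the runner before suspecting the API call.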

## Resources

- `scripts/docker/Dockerfile`: container image for sandbox runtime
- `scripts/docker/entrypoint.py`: SimilarWeb API caller inside container
- `scripts/run_in_docker.sh`: host wrapper for build and secure execution
- `scripts/install_runtime_adapter.sh`: install runtime adapter into `/opt/.manus/.sandbox-runtime`
- `scripts/runtime/data_api.py`: `ApiClient` adapter implementation
- `scripts/test_docker_workflow.sh`: reproducible smoke test script
- `references/api-matrix.md`: endpoint and parameter matrix
@@ -0,0 +1,4 @@
interface:
  display_name: "SimilarWeb Analytics"
  short_description: "Analyze domains with SimilarWeb in a Docker sandbox"
  default_prompt: "Analyze traffic, rank, sources, and geography for a domain using Dockerized SimilarWeb workflow."
@@ -0,0 +1,54 @@
# SimilarWeb API Matrix

## Endpoint Mapping

| CLI `--api` value | API name | Default window |
| --- | --- | --- |
| `global-rank` | `SimilarWeb/get_global_rank` | 6 months |
| `visits-total` | `SimilarWeb/get_visits_total` | 6 months |
| `unique-visit` | `SimilarWeb/get_unique_visit` | 6 months |
| `bounce-rate` | `SimilarWeb/get_bounce_rate` | 6 months |
| `traffic-sources-desktop` | `SimilarWeb/get_traffic_sources_desktop` | 3 months |
| `traffic-sources-mobile` | `SimilarWeb/get_traffic_sources_mobile` | 3 months |
| `traffic-by-country` | `SimilarWeb/get_total_traffic_by_country` | 3 months |

## Parameters

Required:
- `domain`
- `api`

Optional shared parameters:
- `start_date` (`YYYY-MM`)
- `end_date` (`YYYY-MM`)
- `main_domain_only` (`true` or omitted)

Optional API-specific parameters:
- `visits-total`, `bounce-rate`, `traffic-sources-desktop`, `traffic-sources-mobile`:
  - `country` (default `world`)
  - `granularity` (default `monthly`)
- `traffic-by-country`:
  - `limit` (default `10`, max `10`)
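
The parameter rules above can be sketched as a single query builder. This is an illustrative stand-alone version with keyword arguments; the entrypoint's own `build_query` takes an `argparse` namespace instead.

```python
from typing import Dict

# APIs that accept `country` and `granularity`, per the list above.
COUNTRY_APIS = {
    "visits-total",
    "bounce-rate",
    "traffic-sources-desktop",
    "traffic-sources-mobile",
}


def build_query(
    api: str,
    start_date: str,
    end_date: str,
    *,
    country: str = "world",
    granularity: str = "monthly",
    limit: int = 10,
    main_domain_only: bool = False,
) -> Dict[str, object]:
    """Return only the query parameters that apply to the chosen API."""
    query: Dict[str, object] = {"start_date": start_date, "end_date": end_date}
    if main_domain_only:
        query["main_domain_only"] = True
    if api in COUNTRY_APIS:
        query["country"] = country
        query["granularity"] = granularity
    elif api == "traffic-by-country":
        if not 1 <= limit <= 10:
            raise ValueError("limit must be between 1 and 10")
        query["limit"] = limit
    return query


if __name__ == "__main__":
    print(build_query("visits-total", "2025-09", "2026-02"))
    print(build_query("traffic-by-country", "2025-12", "2026-02", limit=5))
```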

## Limits

- Maximum lookback: 12 months
- `traffic-by-country`: max 3 months range
- Granularity: monthly
- Latest dependable month: last complete month
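
A minimal sketch of how these limits can be checked before spending a call. `check_range` and `month_index` are hypothetical names; the last complete month is passed in explicitly so the check is deterministic.

```python
import re

YM_RE = re.compile(r"\d{4}-(0[1-9]|1[0-2])")


def month_index(ym: str) -> int:
    """Convert YYYY-MM into a running month count for easy comparison."""
    if not YM_RE.fullmatch(ym):
        raise ValueError(f"expected YYYY-MM, got {ym!r}")
    year, month = ym.split("-")
    return int(year) * 12 + int(month) - 1


def check_range(api: str, start: str, end: str, last_complete: str) -> None:
    """Raise ValueError if the window violates the limits listed above."""
    s, e, lcm = month_index(start), month_index(end), month_index(last_complete)
    if e < s:
        raise ValueError("end_date must be >= start_date")
    if e > lcm:
        raise ValueError(f"end_date must be <= last complete month {last_complete}")
    if s < lcm - 11:
        raise ValueError("start_date is outside the 12-month lookback")
    if api == "traffic-by-country" and e - s + 1 > 3:
        raise ValueError("traffic-by-country supports at most 3 months")


if __name__ == "__main__":
    check_range("visits-total", "2025-03", "2026-02", "2026-02")  # exactly 12 months
    print("range accepted")
```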

## Data Persistence Rule

Write every call to a JSON file immediately to avoid data loss when credits deplete or calls fail mid-run.
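
One way to apply this rule is sketched below. Note the swapped-in technique: the entrypoint writes the file directly, whereas this sketch writes to a temporary file and renames it with `os.replace`, so a crash mid-write cannot leave a truncated JSON file. `persist_json` is a hypothetical helper.

```python
import json
import os
import tempfile
from typing import Any, Dict


def persist_json(path: str, payload: Dict[str, Any]) -> None:
    """Write payload to `path` via a temp file in the same directory."""
    parent = os.path.dirname(path) or "."
    os.makedirs(parent, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(payload, f, ensure_ascii=False, indent=2)
            f.write("\n")
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        # Clean up the partial temp file, then re-raise the original error.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    target = os.path.join(out_dir, "global-rank-amazon.com.json")
    persist_json(target, {"request": {"api": "global-rank"}, "result": {"source": "mock"}})
    print("wrote", target)
```

Calling `persist_json` right after each API response returns, before starting the next call, keeps earlier results on disk even if a later call fails.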

## Runtime Adapter Notes

Runtime file:
- `/opt/.manus/.sandbox-runtime/data_api.py`

Provisioning command:
- `/root/.codex/skills/similarweb-analytics/scripts/install_runtime_adapter.sh`

Credential modes:
- Preferred: `SIMILARWEB_API_KEY` for official Similarweb API
- Fallback: `RAPIDAPI_KEY` and optional `RAPIDAPI_SIMILARWEB_HOST` (default `similarweb13.p.rapidapi.com`)
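
The precedence between the two credential modes can be sketched as below. `select_mode` is a hypothetical helper that only inspects an environment mapping; it follows the order described above (official key first, RapidAPI second) and makes no network call.

```python
import os
from typing import Dict, Mapping, Optional

DEFAULT_RAPIDAPI_HOST = "similarweb13.p.rapidapi.com"


def select_mode(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Pick the credential mode from environment variables."""
    env = os.environ if env is None else env
    if env.get("SIMILARWEB_API_KEY"):
        return {"mode": "official"}
    if env.get("RAPIDAPI_KEY"):
        host = env.get("RAPIDAPI_SIMILARWEB_HOST") or DEFAULT_RAPIDAPI_HOST
        return {"mode": "rapidapi", "host": host}
    raise RuntimeError(
        "No credentials configured. Set SIMILARWEB_API_KEY (preferred) or RAPIDAPI_KEY."
    )


if __name__ == "__main__":
    print(select_mode({"RAPIDAPI_KEY": "example-key"}))
```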
@@ -0,0 +1,13 @@
FROM python:3.11-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

RUN groupadd -g 10001 app && \
    useradd -m -u 10001 -g app -s /usr/sbin/nologin app

WORKDIR /app
COPY entrypoint.py /app/entrypoint.py

USER app
ENTRYPOINT ["python", "/app/entrypoint.py"]
@@ -0,0 +1,249 @@
#!/usr/bin/env python3
import argparse
import json
import os
import re
import sys
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional, Tuple

RUNTIME_PATH = "/opt/.manus/.sandbox-runtime"

API_MAP = {
    "global-rank": "SimilarWeb/get_global_rank",
    "visits-total": "SimilarWeb/get_visits_total",
    "unique-visit": "SimilarWeb/get_unique_visit",
    "bounce-rate": "SimilarWeb/get_bounce_rate",
    "traffic-sources-desktop": "SimilarWeb/get_traffic_sources_desktop",
    "traffic-sources-mobile": "SimilarWeb/get_traffic_sources_mobile",
    "traffic-by-country": "SimilarWeb/get_total_traffic_by_country",
}

DEFAULT_MONTHS = {
    "global-rank": 6,
    "visits-total": 6,
    "unique-visit": 6,
    "bounce-rate": 6,
    "traffic-sources-desktop": 3,
    "traffic-sources-mobile": 3,
    "traffic-by-country": 3,
}

COUNTRY_REQUIRED_APIS = {
    "visits-total",
    "bounce-rate",
    "traffic-sources-desktop",
    "traffic-sources-mobile",
}

DATE_RE = re.compile(r"^\d{4}-\d{2}$")


@dataclass(frozen=True)
class YearMonth:
    year: int
    month: int

    def to_string(self) -> str:
        return f"{self.year:04d}-{self.month:02d}"

    def __lt__(self, other: "YearMonth") -> bool:  # also serves `>` via reflection
        return (self.year, self.month) < (other.year, other.month)

    def __le__(self, other: "YearMonth") -> bool:  # also serves `>=` via reflection
        return (self.year, self.month) <= (other.year, other.month)


def parse_ym(value: str, field: str) -> YearMonth:
    if not DATE_RE.match(value):
        raise ValueError(f"{field} must be YYYY-MM, got {value!r}")
    year = int(value[0:4])
    month = int(value[5:7])
    if month < 1 or month > 12:
        raise ValueError(f"{field} month must be in 01..12, got {value!r}")
    return YearMonth(year, month)


def shift_months(ym: YearMonth, delta: int) -> YearMonth:
    zero_based = ym.year * 12 + (ym.month - 1) + delta
    if zero_based < 0:
        raise ValueError("date range underflow")
    return YearMonth(zero_based // 12, (zero_based % 12) + 1)


def month_span(start: YearMonth, end: YearMonth) -> int:
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


def last_complete_month(today: date) -> YearMonth:
    current = YearMonth(today.year, today.month)
    return shift_months(current, -1)


def default_date_range(api: str, start: Optional[str], end: Optional[str]) -> Tuple[YearMonth, YearMonth]:
    window = DEFAULT_MONTHS[api]
    lcm = last_complete_month(date.today())

    end_ym = parse_ym(end, "end_date") if end else lcm
    start_ym = parse_ym(start, "start_date") if start else shift_months(end_ym, -(window - 1))

    return start_ym, end_ym


def validate_range(api: str, start_ym: YearMonth, end_ym: YearMonth) -> None:
    if end_ym < start_ym:
        raise ValueError("end_date must be >= start_date")

    lcm = last_complete_month(date.today())
    oldest_allowed = shift_months(lcm, -11)

    if end_ym > lcm:
        raise ValueError(f"end_date must be <= last complete month {lcm.to_string()}")
    if start_ym < oldest_allowed:
        raise ValueError(f"start_date must be >= {oldest_allowed.to_string()} (12-month lookback)")

    span = month_span(start_ym, end_ym)
    if span > 12:
        raise ValueError("date range cannot exceed 12 months")
    if api == "traffic-by-country" and span > 3:
        raise ValueError("traffic-by-country supports at most 3 months")


def sanitize_filename(value: str) -> str:
    safe = re.sub(r"[^a-zA-Z0-9_.-]+", "-", value.strip())
    return safe.strip("-") or "result"


def resolve_output_path(api: str, domain: str, output: Optional[str]) -> str:
    if output:
        return output
    file_name = f"{sanitize_filename(api)}-{sanitize_filename(domain)}.json"
    return os.path.join("/data", file_name)


def build_query(args: argparse.Namespace, start_ym: YearMonth, end_ym: YearMonth) -> Dict[str, object]:
    query: Dict[str, object] = {
        "start_date": start_ym.to_string(),
        "end_date": end_ym.to_string(),
    }

    if args.main_domain_only:
        query["main_domain_only"] = True

    if args.api in COUNTRY_REQUIRED_APIS:
        query["country"] = args.country
        query["granularity"] = args.granularity
    elif args.api == "traffic-by-country":
        query["limit"] = args.limit

    return query


def import_api_client():
    sys.path.insert(0, RUNTIME_PATH)
    try:
        from data_api import ApiClient  # type: ignore
    except Exception as exc:  # pragma: no cover
        raise RuntimeError(
            "data_api import failed. Ensure runtime is mounted to /opt/.manus/.sandbox-runtime"
        ) from exc
    return ApiClient


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Call SimilarWeb APIs using ApiClient inside Docker and persist output JSON."
    )
    parser.add_argument("--api", choices=sorted(API_MAP.keys()))
    parser.add_argument("--domain")
    parser.add_argument("--start-date")
    parser.add_argument("--end-date")
    parser.add_argument("--country", default="world")
    parser.add_argument("--granularity", default="monthly")
    parser.add_argument("--limit", type=int, default=10)
    parser.add_argument("--main-domain-only", action="store_true")
    parser.add_argument("--output")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--mock-result", action="store_true")
    parser.add_argument("--self-test", action="store_true")
    return parser.parse_args()


def write_payload(path: str, payload: Dict[str, object]) -> None:
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False, indent=2)
        f.write("\n")


def run() -> int:
    args = parse_args()

    if args.self_test:
        result = {
            "ok": True,
            "runtime_path": RUNTIME_PATH,
            "runtime_exists": os.path.isdir(RUNTIME_PATH),
            "python_version": sys.version.split()[0],
        }
        print(json.dumps(result, ensure_ascii=False))
        return 0

    if not args.api or not args.domain:
        raise ValueError("--api and --domain are required unless --self-test is used")

    if args.limit < 1 or args.limit > 10:
        raise ValueError("--limit must be between 1 and 10")

    start_ym, end_ym = default_date_range(args.api, args.start_date, args.end_date)
    validate_range(args.api, start_ym, end_ym)

    endpoint = API_MAP[args.api]
    query = build_query(args, start_ym, end_ym)
    output_path = resolve_output_path(args.api, args.domain, args.output)

    request_meta = {
        "api": args.api,
        "endpoint": endpoint,
        "domain": args.domain,
        "query": query,
        "output": output_path,
        "dry_run": bool(args.dry_run),
        "mock_result": bool(args.mock_result),
    }

    if args.dry_run:
        print(json.dumps({"ok": True, "request": request_meta}, ensure_ascii=False))
        return 0

    if args.mock_result:
        payload = {
            "request": request_meta,
            "result": {
                "source": "mock",
                "message": "mock_result enabled",
            },
        }
        write_payload(output_path, payload)
        print(json.dumps({"ok": True, "output": output_path, "mode": "mock"}, ensure_ascii=False))
        return 0

    ApiClient = import_api_client()
    client = ApiClient()
    result = client.call_api(endpoint, path_params={"domain": args.domain}, query=query)
    payload = {"request": request_meta, "result": result}
    write_payload(output_path, payload)

    print(json.dumps({"ok": True, "output": output_path, "endpoint": endpoint}, ensure_ascii=False))
    return 0


if __name__ == "__main__":
    try:
        raise SystemExit(run())
    except Exception as exc:
        print(json.dumps({"ok": False, "error": str(exc)}, ensure_ascii=False), file=sys.stderr)
        raise SystemExit(1)
@@ -0,0 +1,38 @@
#!/usr/bin/env bash
set -euo pipefail

usage() {
  cat <<'EOF'
Usage:
  install_runtime_adapter.sh [target_dir]

Default target_dir:
  /opt/.manus/.sandbox-runtime

Installs:
  data_api.py
from this skill into the target runtime directory.
EOF
}

if [[ "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then
  usage
  exit 0
fi

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SRC="$SCRIPT_DIR/runtime/data_api.py"
TARGET_DIR="${1:-/opt/.manus/.sandbox-runtime}"
TARGET="$TARGET_DIR/data_api.py"

if [[ ! -f "$SRC" ]]; then
  echo "Source file missing: $SRC" >&2
  exit 1
fi

mkdir -p "$TARGET_DIR"
cp -f "$SRC" "$TARGET"
chmod 755 "$TARGET"

echo "Installed runtime adapter:"
echo "  $TARGET"
@@ -0,0 +1,128 @@
#!/usr/bin/env bash
set -euo pipefail

usage() {
  cat <<'EOF'
Usage:
  run_in_docker.sh [runner options] -- [entrypoint args]

Runner options:
  --build                 Build image before running
  --image <name>          Override image name (default: codex/similarweb-analytics:latest)
  --runtime-dir <path>    Host path that contains data_api.py (default: /opt/.manus/.sandbox-runtime)
  --output-dir <path>     Host output directory mounted to /data (default: ./similarweb-output)
  --network <mode>        Docker network mode (default: bridge)
  -h, --help              Show this message

Entrypoint args:
  --self-test
  --api <global-rank|visits-total|unique-visit|bounce-rate|traffic-sources-desktop|traffic-sources-mobile|traffic-by-country>
  --domain <domain>
  --start-date YYYY-MM
  --end-date YYYY-MM
  --country <country>
  --granularity monthly
  --limit <1..10>
  --main-domain-only
  --output /data/<file>.json
  --dry-run
  --mock-result

Examples:
  run_in_docker.sh --build -- --self-test
  run_in_docker.sh -- --api visits-total --domain amazon.com --dry-run
  run_in_docker.sh -- --api global-rank --domain amazon.com --output /data/amazon-rank.json
EOF
}

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
IMAGE="${SIMILARWEB_IMAGE:-codex/similarweb-analytics:latest}"
RUNTIME_DIR="${SIMILARWEB_RUNTIME_DIR:-/opt/.manus/.sandbox-runtime}"
OUTPUT_DIR="${SIMILARWEB_OUTPUT_DIR:-$PWD/similarweb-output}"
NETWORK_MODE="${SIMILARWEB_NETWORK_MODE:-bridge}"
BUILD_IMAGE=0

while [[ $# -gt 0 ]]; do
  case "$1" in
    --build)
      BUILD_IMAGE=1
      shift
      ;;
    --image)
      IMAGE="${2:-}"
      shift 2
      ;;
    --runtime-dir)
      RUNTIME_DIR="${2:-}"
      shift 2
      ;;
    --output-dir)
      OUTPUT_DIR="${2:-}"
      shift 2
      ;;
    --network)
      NETWORK_MODE="${2:-}"
      shift 2
      ;;
    --)
      shift
      break
      ;;
    -h|--help)
      usage
      exit 0
      ;;
    *)
      echo "Unknown runner option: $1" >&2
      usage >&2
      exit 2
      ;;
  esac
done

if [[ $# -eq 0 ]]; then
  echo "Missing entrypoint args. Use -- to pass container args." >&2
  usage >&2
  exit 2
fi

if ! command -v docker >/dev/null 2>&1; then
  echo "docker command not found" >&2
  exit 127
fi

if [[ ! -d "$RUNTIME_DIR" ]]; then
  echo "Runtime dir not found: $RUNTIME_DIR" >&2
  echo "It must contain data_api.py for real API calls." >&2
  exit 1
fi
if [[ ! -f "$RUNTIME_DIR/data_api.py" ]]; then
  echo "Runtime module missing: $RUNTIME_DIR/data_api.py" >&2
  exit 1
fi

mkdir -p "$OUTPUT_DIR"
# Keep container non-root while ensuring mounted output path is writable.
chmod 0777 "$OUTPUT_DIR" 2>/dev/null || true

if [[ "$BUILD_IMAGE" -eq 1 ]] || ! docker image inspect "$IMAGE" >/dev/null 2>&1; then
  docker build -t "$IMAGE" -f "$SCRIPT_DIR/docker/Dockerfile" "$SCRIPT_DIR/docker"
fi

docker run --rm \
  --network "$NETWORK_MODE" \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64m \
  --tmpfs /var/tmp:rw,noexec,nosuid,size=32m \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --pids-limit 256 \
  --memory 512m \
  --cpus 1.0 \
  -e SIMILARWEB_API_KEY \
  -e SIMILARWEB_BASE_URL \
  -e RAPIDAPI_KEY \
  -e RAPIDAPI_SIMILARWEB_HOST \
  -v "$RUNTIME_DIR:/opt/.manus/.sandbox-runtime:ro" \
  -v "$OUTPUT_DIR:/data:rw" \
  "$IMAGE" "$@"
@@ -0,0 +1,166 @@
#!/usr/bin/env python3
"""Minimal ApiClient runtime for SimilarWeb skill.

Implements the subset of Manus-style interface used by the skill:
  ApiClient().call_api(api_name, path_params={"domain": ...}, query={...})

Primary mode: Similarweb official API (requires SIMILARWEB_API_KEY)
Fallback mode: RapidAPI similarweb13 domain snapshot (requires RAPIDAPI_KEY)
"""

from __future__ import annotations

import json
import os
import urllib.error  # imported explicitly; HTTPError/URLError are caught below
import urllib.parse
import urllib.request
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


class ApiError(RuntimeError):
    pass


@dataclass
class EndpointSpec:
    path: str


OFFICIAL_ENDPOINTS: Dict[str, EndpointSpec] = {
    "SimilarWeb/get_global_rank": EndpointSpec("/v1/website/{domain}/global-rank/global-rank"),
    "SimilarWeb/get_visits_total": EndpointSpec("/v1/website/{domain}/total-traffic-and-engagement/visits"),
    "SimilarWeb/get_unique_visit": EndpointSpec("/v1/website/{domain}/deduplicated-audience/deduplicated-audience"),
    "SimilarWeb/get_bounce_rate": EndpointSpec("/v1/website/{domain}/total-traffic-and-engagement/bounce-rate"),
    "SimilarWeb/get_traffic_sources_desktop": EndpointSpec("/v1/website/{domain}/traffic-sources/desktop"),
    "SimilarWeb/get_traffic_sources_mobile": EndpointSpec("/v1/website/{domain}/traffic-sources/mobile-web"),
    "SimilarWeb/get_total_traffic_by_country": EndpointSpec("/v1/website/{domain}/geography/total-traffic-and-engagement"),
}


class ApiClient:
    def __init__(
        self,
        *,
        similarweb_api_key: Optional[str] = None,
        similarweb_base_url: Optional[str] = None,
        rapidapi_key: Optional[str] = None,
        rapidapi_host: Optional[str] = None,
        timeout: int = 30,
    ) -> None:
        self.similarweb_api_key = similarweb_api_key or os.getenv("SIMILARWEB_API_KEY")
        self.similarweb_base_url = (
            similarweb_base_url
            or os.getenv("SIMILARWEB_BASE_URL")
            or "https://api.similarweb.com"
        ).rstrip("/")
        self.rapidapi_key = rapidapi_key or os.getenv("RAPIDAPI_KEY")
        self.rapidapi_host = rapidapi_host or os.getenv("RAPIDAPI_SIMILARWEB_HOST") or "similarweb13.p.rapidapi.com"
        self.timeout = timeout

    def call_api(
        self,
        api_name: str,
        *,
        path_params: Optional[Mapping[str, Any]] = None,
        query: Optional[Mapping[str, Any]] = None,
    ) -> Dict[str, Any]:
        path_params = dict(path_params or {})
        query = dict(query or {})

        domain = str(path_params.get("domain", "")).strip()
        if not domain:
            raise ApiError("path_params.domain is required")

        if self.similarweb_api_key:
            return self._call_official(api_name, domain=domain, query=query)

        if self.rapidapi_key:
            return self._call_rapidapi_snapshot(api_name, domain=domain, query=query)

        raise ApiError(
            "No credentials configured. Set SIMILARWEB_API_KEY (preferred) or RAPIDAPI_KEY."
        )

    def _call_official(self, api_name: str, *, domain: str, query: Dict[str, Any]) -> Dict[str, Any]:
        spec = OFFICIAL_ENDPOINTS.get(api_name)
        if not spec:
            raise ApiError(f"Unsupported api_name for official mode: {api_name}")

        # Percent-encode the domain so it cannot alter the URL path.
        path = spec.path.format(domain=urllib.parse.quote(domain, safe=""))
        q = self._clean_query(query)
        # Key-free URL for metadata and error messages, so the API key is
        # never written into the saved JSON output.
        public_url = f"{self.similarweb_base_url}{path}?{urllib.parse.urlencode(q)}"
        q["api_key"] = self.similarweb_api_key
        url = f"{self.similarweb_base_url}{path}?{urllib.parse.urlencode(q)}"

        req = urllib.request.Request(url=url, method="GET")
        return self._do_request(req, mode="official", api_name=api_name, url=public_url)

    def _call_rapidapi_snapshot(self, api_name: str, *, domain: str, query: Dict[str, Any]) -> Dict[str, Any]:
        encoded_domain = urllib.parse.quote(domain)
        url = f"https://{self.rapidapi_host}/v2/getdomain?domain={encoded_domain}"
        headers = {
            "x-rapidapi-key": self.rapidapi_key or "",
            "x-rapidapi-host": self.rapidapi_host,
        }
        req = urllib.request.Request(url=url, method="GET", headers=headers)

        resp = self._do_request(req, mode="rapidapi", api_name=api_name, url=url)
        return {
            "_adapter": {
                "mode": "rapidapi",
                "note": "Using /v2/getdomain snapshot fallback; not 1:1 with official endpoint schema.",
                "requested_api": api_name,
                "requested_query": query,
            },
            "data": resp,
        }

    @staticmethod
    def _clean_query(query: Mapping[str, Any]) -> Dict[str, Any]:
        out: Dict[str, Any] = {}
        for k, v in query.items():
            if v is None:
                continue
            if isinstance(v, bool):
                out[k] = "true" if v else "false"
            else:
                out[k] = str(v)
        return out

    def _do_request(self, req: urllib.request.Request, *, mode: str, api_name: str, url: str) -> Dict[str, Any]:
        try:
            with urllib.request.urlopen(req, timeout=self.timeout) as resp:
                body = resp.read().decode("utf-8", errors="replace")
                data = json.loads(body) if body else {}
                return {
                    "_meta": {
                        "mode": mode,
                        "api_name": api_name,
                        "http_status": resp.status,
                        "url": url,
                    },
                    "response": data,
                }
        except urllib.error.HTTPError as exc:
            body = exc.read().decode("utf-8", errors="replace")
            try:
                parsed = json.loads(body)
            except Exception:
                parsed = {"raw": body}
            raise ApiError(
                json.dumps(
                    {
                        "http_status": exc.code,
                        "mode": mode,
                        "api_name": api_name,
                        "url": url,
                        "error": parsed,
                    },
                    ensure_ascii=False,
                )
            ) from exc
        except urllib.error.URLError as exc:
            raise ApiError(f"Network error for {url}: {exc}") from exc


__all__ = ["ApiClient", "ApiError"]
@@ -0,0 +1,40 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
RUNTIME_FIXTURE_DIR="$SCRIPT_DIR/tests/fixtures"
OUTPUT_DIR="${1:-$SCRIPT_DIR/../tmp/test-output}"
RUNNER="$SCRIPT_DIR/run_in_docker.sh"

mkdir -p "$OUTPUT_DIR"

echo "[1/4] Build image + self-test"
"$RUNNER" --build --runtime-dir "$RUNTIME_FIXTURE_DIR" --output-dir "$OUTPUT_DIR" -- --self-test

echo "[2/4] Dry-run validation"
"$RUNNER" --runtime-dir "$RUNTIME_FIXTURE_DIR" --output-dir "$OUTPUT_DIR" -- \
  --api visits-total \
  --domain amazon.com \
  --country world \
  --dry-run

echo "[3/4] Mock call writes output file"
"$RUNNER" --runtime-dir "$RUNTIME_FIXTURE_DIR" --output-dir "$OUTPUT_DIR" -- \
  --api global-rank \
  --domain amazon.com \
  --mock-result \
  --output /data/mock-global-rank.json

test -f "$OUTPUT_DIR/mock-global-rank.json"

echo "[4/4] Fixture ApiClient end-to-end call writes output"
"$RUNNER" --runtime-dir "$RUNTIME_FIXTURE_DIR" --output-dir "$OUTPUT_DIR" -- \
  --api traffic-by-country \
  --domain amazon.com \
  --start-date 2025-12 \
  --end-date 2026-02 \
  --limit 3 \
  --output /data/fixture-traffic-by-country.json

test -f "$OUTPUT_DIR/fixture-traffic-by-country.json"
echo "All tests passed. Output dir: $OUTPUT_DIR"
8
similarweb-analytics/scripts/tests/fixtures/data_api.py
vendored
Normal file
@@ -0,0 +1,8 @@
class ApiClient:
    def call_api(self, api_name, path_params=None, query=None):
        return {
            "fixture": True,
            "api_name": api_name,
            "path_params": path_params or {},
            "query": query or {},
        }