Compare commits

2 commits

Author SHA1 Message Date
cryptocommuniums-afk
be5f7c8808 Remove Python cache files from similarweb-analytics skill 2026-03-05 10:28:21 +08:00
cryptocommuniums-afk
5e0c64db9e Add similarweb-analytics Docker sandbox skill 2026-03-05 10:28:14 +08:00
10 files changed, 856 additions and 0 deletions


@@ -0,0 +1,156 @@
---
name: similarweb-analytics
description: Analyze website and domain traffic with SimilarWeb APIs through a Docker sandbox. Use for visits, unique visitors, rank, bounce rate, traffic sources, traffic by country, and domain comparison research.
---
# SimilarWeb Analytics
## Overview
Use this skill to run SimilarWeb analytics in an isolated Docker container and save every API response to JSON immediately.
Use it when the user asks about domain traffic, popularity ranking, engagement quality, channel mix, or country-level traffic split.
## Trigger Cues
Use this skill when the request includes one or more of these cues:
- Domain inputs such as `google.com`, `amazon.com`, `openai.com`
- Traffic words such as `visits`, `unique visitors`, `traffic trend`
- Ranking words such as `global rank`, `website rank`
- Engagement words such as `bounce rate`, `pages per visit`, `visit duration`
- Source words such as `organic`, `paid`, `direct`, `social`, `referrals`
- Geography words such as `top countries`, `country split`, `regional traffic`
- Comparison words such as `compare`, `vs`, `benchmark`
## Workflow
1. Parse user intent into API call inputs:
   - `domain` (required)
   - `api` (required)
   - Optional: `start_date`, `end_date`, `country`, `granularity`, `limit`, `main_domain_only`
2. Build image when needed:
   - Run `scripts/run_in_docker.sh --build -- --self-test`
3. Execute query in Docker sandbox:
   - Run `scripts/run_in_docker.sh -- --api <api> --domain <domain> ...`
4. Persist output on every call:
   - Always pass `--output /data/<file>.json`, or omit `--output` to get the auto-generated `/data/<api>-<domain>.json`
   - Never keep API output only in terminal output
5. For comparisons:
   - Execute one call per domain with the same time window
   - Save each domain response as a separate JSON file for reproducible analysis
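As a minimal sketch of step 5, the comparison calls can be generated programmatically so that every domain shares the same window and gets its own output file. The helper name `comparison_commands`, the relative `RUNNER` path, and the `ebay.com` domain are assumptions for illustration; only the flags are taken from the wrapper's usage text.

```python
import shlex
from typing import List

# Assumption: the wrapper is invoked from the skill root; adjust the path as needed.
RUNNER = "scripts/run_in_docker.sh"


def comparison_commands(api: str, domains: List[str], start_date: str, end_date: str) -> List[List[str]]:
    """Build one wrapper command per domain, all sharing the same time window."""
    commands = []
    for domain in domains:
        # One JSON file per domain keeps the comparison reproducible.
        output = f"/data/{api}-{domain}.json"
        commands.append([
            RUNNER, "--",
            "--api", api,
            "--domain", domain,
            "--start-date", start_date,
            "--end-date", end_date,
            "--output", output,
        ])
    return commands


if __name__ == "__main__":
    for cmd in comparison_commands("visits-total", ["amazon.com", "ebay.com"], "2025-12", "2026-02"):
        print(shlex.join(cmd))
```

Each printed line can be run as-is, or each list can be passed to `subprocess.run(cmd, check=True)`.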
## Command Entry Points
- Main host wrapper: `scripts/run_in_docker.sh`
- Container entrypoint: `scripts/docker/entrypoint.py`
- Image definition: `scripts/docker/Dockerfile`
- Runtime adapter installer: `scripts/install_runtime_adapter.sh`
- Runtime adapter source: `scripts/runtime/data_api.py`
- Test runner: `scripts/test_docker_workflow.sh`
## Quick Start
Install runtime adapter to expected host path:
```bash
/root/.codex/skills/similarweb-analytics/scripts/install_runtime_adapter.sh
```
Build image and verify runtime:
```bash
/root/.codex/skills/similarweb-analytics/scripts/run_in_docker.sh --build -- --self-test
```
Dry run without consuming API credits:
```bash
/root/.codex/skills/similarweb-analytics/scripts/run_in_docker.sh -- \
  --api visits-total \
  --domain amazon.com \
  --country world \
  --dry-run
```
Real call and save data immediately:
```bash
/root/.codex/skills/similarweb-analytics/scripts/run_in_docker.sh -- \
  --api traffic-by-country \
  --domain amazon.com \
  --start-date 2025-12 \
  --end-date 2026-02 \
  --limit 10 \
  --output /data/amazon-country.json
```
## Supported APIs
- `global-rank` -> `SimilarWeb/get_global_rank`
- `visits-total` -> `SimilarWeb/get_visits_total`
- `unique-visit` -> `SimilarWeb/get_unique_visit`
- `bounce-rate` -> `SimilarWeb/get_bounce_rate`
- `traffic-sources-desktop` -> `SimilarWeb/get_traffic_sources_desktop`
- `traffic-sources-mobile` -> `SimilarWeb/get_traffic_sources_mobile`
- `traffic-by-country` -> `SimilarWeb/get_total_traffic_by_country`
For parameter matrix and defaults, see `references/api-matrix.md`.
## Sandbox Rules
`scripts/run_in_docker.sh` runs with:
- Non-root container user
- Read-only root filesystem
- `tmpfs` only for `/tmp` and `/var/tmp`
- Dropped Linux capabilities (`--cap-drop ALL`)
- `no-new-privileges` enabled
- CPU, memory, and PID limits
Runtime dependency mount:
- Must mount host runtime path into container at `/opt/.manus/.sandbox-runtime`
- Default host path is `/opt/.manus/.sandbox-runtime`
- You can override with `--runtime-dir <path>`
- `data_api.py` must exist in that runtime directory
Credential pass-through:
- `SIMILARWEB_API_KEY` for official Similarweb API mode
- Optional fallback: `RAPIDAPI_KEY` and `RAPIDAPI_SIMILARWEB_HOST`
- Runner auto-forwards these env vars into container when present
## Data Constraints
- Historical data window is at most 12 months
- `traffic-by-country` is limited to at most 3 months
- Latest reliable month is the last complete month
- Default date range:
  - 6 months: `global-rank`, `visits-total`, `unique-visit`, `bounce-rate`
  - 3 months: `traffic-sources-desktop`, `traffic-sources-mobile`, `traffic-by-country`
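As a worked example of these defaults, the sketch below computes the window that would apply on a given day, assuming the window ends at the last complete month and counts both endpoints. The function names are illustrative and not part of the skill.

```python
from datetime import date
from typing import Tuple


def last_complete_month(today: date) -> Tuple[int, int]:
    """Return (year, month) of the month before the one containing `today`."""
    index = today.year * 12 + (today.month - 1) - 1
    return index // 12, index % 12 + 1


def default_window(today: date, months: int) -> Tuple[str, str]:
    """Return (start, end) as YYYY-MM strings for an inclusive window of `months` months."""
    end_year, end_month = last_complete_month(today)
    start_index = end_year * 12 + (end_month - 1) - (months - 1)
    start_year, start_month = start_index // 12, start_index % 12 + 1
    return f"{start_year:04d}-{start_month:02d}", f"{end_year:04d}-{end_month:02d}"


if __name__ == "__main__":
    today = date(2026, 3, 5)
    print(default_window(today, 6))  # 6-month APIs such as visits-total
    print(default_window(today, 3))  # 3-month APIs such as traffic-by-country
```

On `2026-03-05` the 3-month window comes out as `2025-12` through `2026-02`, which matches the dates used in the Quick Start call.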
## Validation Record
Last validated on `2026-03-05`:
- Docker image build succeeded
- Container self-test succeeded
- End-to-end fixture call succeeded and wrote JSON output
- Skill structure validation succeeded with `quick_validate.py`
- Runtime adapter installed to `/opt/.manus/.sandbox-runtime/data_api.py` and imported successfully
- Official mode live call attempted and failed fast with explicit credential error when `SIMILARWEB_API_KEY` is unset
- Live network call attempted via RapidAPI fallback; request reached provider and returned `403 not subscribed` (credential/subscription issue, not runtime failure)
## Troubleshooting
- Error `data_api import failed`:
  - Check that runtime path exists on host and is mounted to `/opt/.manus/.sandbox-runtime`
- Error about date range:
  - Use `YYYY-MM` format and keep range inside API limits
- No output file:
  - Ensure `--output` points to `/data/...` inside the container, then look for the file in the host directory mounted via `--output-dir` (default `./similarweb-output`)
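For the missing-output case, it can help to map the container path back to the host. The sketch below assumes the default `--output-dir` of `./similarweb-output` mounted at `/data`; `host_output_path` is a hypothetical helper, not something the skill ships.

```python
from pathlib import PurePosixPath

# Mount point used by the wrapper for the output volume.
CONTAINER_DATA_DIR = PurePosixPath("/data")


def host_output_path(container_path: str, output_dir: str = "./similarweb-output") -> PurePosixPath:
    """Translate a path under /data in the container to the host-side file path."""
    path = PurePosixPath(container_path)
    try:
        relative = path.relative_to(CONTAINER_DATA_DIR)
    except ValueError:
        # Anything outside /data is not on the mounted volume and will not reach the host.
        raise ValueError(f"{container_path!r} is outside /data") from None
    return PurePosixPath(output_dir) / relative


if __name__ == "__main__":
    print(host_output_path("/data/amazon-country.json"))
```

A path such as `/tmp/result.json` raises `ValueError` here; inside the sandbox it would land on a `tmpfs` mount and disappear when the container exits.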
## Resources
- `scripts/docker/Dockerfile`: container image for sandbox runtime
- `scripts/docker/entrypoint.py`: SimilarWeb API caller inside container
- `scripts/run_in_docker.sh`: host wrapper for build and secure execution
- `scripts/install_runtime_adapter.sh`: install runtime adapter into `/opt/.manus/.sandbox-runtime`
- `scripts/runtime/data_api.py`: `ApiClient` adapter implementation
- `scripts/test_docker_workflow.sh`: reproducible smoke test script
- `references/api-matrix.md`: endpoint and parameter matrix


@@ -0,0 +1,4 @@
interface:
  display_name: "SimilarWeb Analytics"
  short_description: "Analyze domains with SimilarWeb in a Docker sandbox"
  default_prompt: "Analyze traffic, rank, sources, and geography for a domain using Dockerized SimilarWeb workflow."


@@ -0,0 +1,54 @@
# SimilarWeb API Matrix
## Endpoint Mapping
| CLI `--api` value | API name | Default window |
| --- | --- | --- |
| `global-rank` | `SimilarWeb/get_global_rank` | 6 months |
| `visits-total` | `SimilarWeb/get_visits_total` | 6 months |
| `unique-visit` | `SimilarWeb/get_unique_visit` | 6 months |
| `bounce-rate` | `SimilarWeb/get_bounce_rate` | 6 months |
| `traffic-sources-desktop` | `SimilarWeb/get_traffic_sources_desktop` | 3 months |
| `traffic-sources-mobile` | `SimilarWeb/get_traffic_sources_mobile` | 3 months |
| `traffic-by-country` | `SimilarWeb/get_total_traffic_by_country` | 3 months |
## Parameters
Required:
- `domain`
- `api`
Optional shared parameters:
- `start_date` (`YYYY-MM`)
- `end_date` (`YYYY-MM`)
- `main_domain_only` (`true` or omitted)
Optional API-specific parameters:
- `visits-total`, `bounce-rate`, `traffic-sources-desktop`, `traffic-sources-mobile`:
  - `country` (default `world`)
  - `granularity` (default `monthly`)
- `traffic-by-country`:
  - `limit` (default `10`, max `10`)
## Limits
- Maximum lookback: 12 months
- `traffic-by-country`: max 3 months range
- Granularity: monthly
- Latest dependable month: last complete month
## Data Persistence Rule
Write every call to a JSON file immediately to avoid data loss when credits deplete or calls fail mid-run.
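One way to harden this rule is to write through a temporary file and rename it into place, so a crash mid-write never leaves a truncated JSON file. This is a swapped-in technique shown as a sketch; the skill's entrypoint writes the file directly, and `save_json_atomically` is a hypothetical name.

```python
import json
import os
import tempfile
from typing import Any, Dict


def save_json_atomically(path: str, payload: Dict[str, Any]) -> None:
    """Write payload to a temp file in the target directory, then rename it over path."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(payload, f, ensure_ascii=False, indent=2)
            f.write("\n")
        # os.replace is atomic when source and destination share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    save_json_atomically("similarweb-output/example.json", {"request": {}, "result": {}})
```

The temporary file is created next to the target so the final rename stays on one filesystem.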
## Runtime Adapter Notes
Runtime file:
- `/opt/.manus/.sandbox-runtime/data_api.py`
Provisioning command:
- `/root/.codex/skills/similarweb-analytics/scripts/install_runtime_adapter.sh`
Credential modes:
- Preferred: `SIMILARWEB_API_KEY` for official Similarweb API
- Fallback: `RAPIDAPI_KEY` and optional `RAPIDAPI_SIMILARWEB_HOST` (default `similarweb13.p.rapidapi.com`)
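The precedence between the two credential modes can be sketched as a small check, useful before spending a real call. It follows the order in which the runtime adapter's `call_api` tests the keys; `credential_mode` itself is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional


def credential_mode(env: Mapping[str, str]) -> Optional[str]:
    """Return which mode the adapter would pick for the given environment, if any."""
    if env.get("SIMILARWEB_API_KEY"):
        return "official"  # preferred: official Similarweb API
    if env.get("RAPIDAPI_KEY"):
        return "rapidapi"  # fallback: RapidAPI snapshot endpoint
    return None  # no credentials: a real call would fail fast


if __name__ == "__main__":
    print(credential_mode(os.environ))
```

When both keys are set, the official mode wins and the RapidAPI key is ignored.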


@@ -0,0 +1,13 @@
FROM python:3.11-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

RUN groupadd -g 10001 app && \
    useradd -m -u 10001 -g app -s /usr/sbin/nologin app

WORKDIR /app
COPY entrypoint.py /app/entrypoint.py

USER app
ENTRYPOINT ["python", "/app/entrypoint.py"]


@@ -0,0 +1,249 @@
#!/usr/bin/env python3
import argparse
import json
import os
import re
import sys
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional, Tuple

RUNTIME_PATH = "/opt/.manus/.sandbox-runtime"

API_MAP = {
    "global-rank": "SimilarWeb/get_global_rank",
    "visits-total": "SimilarWeb/get_visits_total",
    "unique-visit": "SimilarWeb/get_unique_visit",
    "bounce-rate": "SimilarWeb/get_bounce_rate",
    "traffic-sources-desktop": "SimilarWeb/get_traffic_sources_desktop",
    "traffic-sources-mobile": "SimilarWeb/get_traffic_sources_mobile",
    "traffic-by-country": "SimilarWeb/get_total_traffic_by_country",
}

DEFAULT_MONTHS = {
    "global-rank": 6,
    "visits-total": 6,
    "unique-visit": 6,
    "bounce-rate": 6,
    "traffic-sources-desktop": 3,
    "traffic-sources-mobile": 3,
    "traffic-by-country": 3,
}

COUNTRY_REQUIRED_APIS = {
    "visits-total",
    "bounce-rate",
    "traffic-sources-desktop",
    "traffic-sources-mobile",
}

DATE_RE = re.compile(r"^\d{4}-\d{2}$")
@dataclass(frozen=True)
class YearMonth:
    year: int
    month: int

    def to_string(self) -> str:
        return f"{self.year:04d}-{self.month:02d}"

    def __lt__(self, other: "YearMonth") -> bool:
        return (self.year, self.month) < (other.year, other.month)

    def __le__(self, other: "YearMonth") -> bool:
        return (self.year, self.month) <= (other.year, other.month)


def parse_ym(value: str, field: str) -> YearMonth:
    if not DATE_RE.match(value):
        raise ValueError(f"{field} must be YYYY-MM, got {value!r}")
    year = int(value[0:4])
    month = int(value[5:7])
    if month < 1 or month > 12:
        raise ValueError(f"{field} month must be in 01..12, got {value!r}")
    return YearMonth(year, month)


def shift_months(ym: YearMonth, delta: int) -> YearMonth:
    zero_based = ym.year * 12 + (ym.month - 1) + delta
    if zero_based < 0:
        raise ValueError("date range underflow")
    return YearMonth(zero_based // 12, (zero_based % 12) + 1)


def month_span(start: YearMonth, end: YearMonth) -> int:
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


def last_complete_month(today: date) -> YearMonth:
    current = YearMonth(today.year, today.month)
    return shift_months(current, -1)


def default_date_range(api: str, start: Optional[str], end: Optional[str]) -> Tuple[YearMonth, YearMonth]:
    window = DEFAULT_MONTHS[api]
    lcm = last_complete_month(date.today())
    end_ym = parse_ym(end, "end_date") if end else lcm
    start_ym = parse_ym(start, "start_date") if start else shift_months(end_ym, -(window - 1))
    return start_ym, end_ym


def validate_range(api: str, start_ym: YearMonth, end_ym: YearMonth) -> None:
    if end_ym < start_ym:
        raise ValueError("end_date must be >= start_date")
    lcm = last_complete_month(date.today())
    oldest_allowed = shift_months(lcm, -11)
    if end_ym > lcm:
        raise ValueError(f"end_date must be <= last complete month {lcm.to_string()}")
    if start_ym < oldest_allowed:
        raise ValueError(f"start_date must be >= {oldest_allowed.to_string()} (12-month lookback)")
    span = month_span(start_ym, end_ym)
    if span > 12:
        raise ValueError("date range cannot exceed 12 months")
    if api == "traffic-by-country" and span > 3:
        raise ValueError("traffic-by-country supports at most 3 months")


def sanitize_filename(value: str) -> str:
    safe = re.sub(r"[^a-zA-Z0-9_.-]+", "-", value.strip())
    return safe.strip("-") or "result"


def resolve_output_path(api: str, domain: str, output: Optional[str]) -> str:
    if output:
        return output
    file_name = f"{sanitize_filename(api)}-{sanitize_filename(domain)}.json"
    return os.path.join("/data", file_name)


def build_query(args: argparse.Namespace, start_ym: YearMonth, end_ym: YearMonth) -> Dict[str, object]:
    query: Dict[str, object] = {
        "start_date": start_ym.to_string(),
        "end_date": end_ym.to_string(),
    }
    if args.main_domain_only:
        query["main_domain_only"] = True
    if args.api in COUNTRY_REQUIRED_APIS:
        query["country"] = args.country
        query["granularity"] = args.granularity
    elif args.api == "traffic-by-country":
        query["limit"] = args.limit
    return query
def import_api_client():
    sys.path.insert(0, RUNTIME_PATH)
    try:
        from data_api import ApiClient  # type: ignore
    except Exception as exc:  # pragma: no cover
        raise RuntimeError(
            "data_api import failed. Ensure runtime is mounted to /opt/.manus/.sandbox-runtime"
        ) from exc
    return ApiClient


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Call SimilarWeb APIs using ApiClient inside Docker and persist output JSON."
    )
    parser.add_argument("--api", choices=sorted(API_MAP.keys()))
    parser.add_argument("--domain")
    parser.add_argument("--start-date")
    parser.add_argument("--end-date")
    parser.add_argument("--country", default="world")
    parser.add_argument("--granularity", default="monthly")
    parser.add_argument("--limit", type=int, default=10)
    parser.add_argument("--main-domain-only", action="store_true")
    parser.add_argument("--output")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--mock-result", action="store_true")
    parser.add_argument("--self-test", action="store_true")
    return parser.parse_args()


def write_payload(path: str, payload: Dict[str, object]) -> None:
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False, indent=2)
        f.write("\n")
def run() -> int:
    args = parse_args()
    if args.self_test:
        result = {
            "ok": True,
            "runtime_path": RUNTIME_PATH,
            "runtime_exists": os.path.isdir(RUNTIME_PATH),
            "python_version": sys.version.split()[0],
        }
        print(json.dumps(result, ensure_ascii=False))
        return 0

    if not args.api or not args.domain:
        raise ValueError("--api and --domain are required unless --self-test is used")
    if args.limit < 1 or args.limit > 10:
        raise ValueError("--limit must be between 1 and 10")

    start_ym, end_ym = default_date_range(args.api, args.start_date, args.end_date)
    validate_range(args.api, start_ym, end_ym)

    endpoint = API_MAP[args.api]
    query = build_query(args, start_ym, end_ym)
    output_path = resolve_output_path(args.api, args.domain, args.output)
    request_meta = {
        "api": args.api,
        "endpoint": endpoint,
        "domain": args.domain,
        "query": query,
        "output": output_path,
        "dry_run": bool(args.dry_run),
        "mock_result": bool(args.mock_result),
    }

    if args.dry_run:
        print(json.dumps({"ok": True, "request": request_meta}, ensure_ascii=False))
        return 0

    if args.mock_result:
        payload = {
            "request": request_meta,
            "result": {
                "source": "mock",
                "message": "mock_result enabled",
            },
        }
        write_payload(output_path, payload)
        print(json.dumps({"ok": True, "output": output_path, "mode": "mock"}, ensure_ascii=False))
        return 0

    ApiClient = import_api_client()
    client = ApiClient()
    result = client.call_api(endpoint, path_params={"domain": args.domain}, query=query)
    payload = {"request": request_meta, "result": result}
    write_payload(output_path, payload)
    print(json.dumps({"ok": True, "output": output_path, "endpoint": endpoint}, ensure_ascii=False))
    return 0


if __name__ == "__main__":
    try:
        raise SystemExit(run())
    except Exception as exc:
        print(json.dumps({"ok": False, "error": str(exc)}, ensure_ascii=False), file=sys.stderr)
        raise SystemExit(1)


@@ -0,0 +1,38 @@
#!/usr/bin/env bash
set -euo pipefail

usage() {
  cat <<'EOF'
Usage:
  install_runtime_adapter.sh [target_dir]

Default target_dir:
  /opt/.manus/.sandbox-runtime

Installs:
  data_api.py
from this skill into the target runtime directory.
EOF
}

if [[ "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then
  usage
  exit 0
fi

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SRC="$SCRIPT_DIR/runtime/data_api.py"
TARGET_DIR="${1:-/opt/.manus/.sandbox-runtime}"
TARGET="$TARGET_DIR/data_api.py"

if [[ ! -f "$SRC" ]]; then
  echo "Source file missing: $SRC" >&2
  exit 1
fi

mkdir -p "$TARGET_DIR"
cp -f "$SRC" "$TARGET"
chmod 755 "$TARGET"

echo "Installed runtime adapter:"
echo " $TARGET"


@@ -0,0 +1,128 @@
#!/usr/bin/env bash
set -euo pipefail

usage() {
  cat <<'EOF'
Usage:
  run_in_docker.sh [runner options] -- [entrypoint args]

Runner options:
  --build               Build image before running
  --image <name>        Override image name (default: codex/similarweb-analytics:latest)
  --runtime-dir <path>  Host path that contains data_api.py (default: /opt/.manus/.sandbox-runtime)
  --output-dir <path>   Host output directory mounted to /data (default: ./similarweb-output)
  --network <mode>      Docker network mode (default: bridge)
  -h, --help            Show this message

Entrypoint args:
  --self-test
  --api <global-rank|visits-total|unique-visit|bounce-rate|traffic-sources-desktop|traffic-sources-mobile|traffic-by-country>
  --domain <domain>
  --start-date YYYY-MM
  --end-date YYYY-MM
  --country <country>
  --granularity monthly
  --limit <1..10>
  --main-domain-only
  --output /data/<file>.json
  --dry-run
  --mock-result

Examples:
  run_in_docker.sh --build -- --self-test
  run_in_docker.sh -- --api visits-total --domain amazon.com --dry-run
  run_in_docker.sh -- --api global-rank --domain amazon.com --output /data/amazon-rank.json
EOF
}
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
IMAGE="${SIMILARWEB_IMAGE:-codex/similarweb-analytics:latest}"
RUNTIME_DIR="${SIMILARWEB_RUNTIME_DIR:-/opt/.manus/.sandbox-runtime}"
OUTPUT_DIR="${SIMILARWEB_OUTPUT_DIR:-$PWD/similarweb-output}"
NETWORK_MODE="${SIMILARWEB_NETWORK_MODE:-bridge}"
BUILD_IMAGE=0

while [[ $# -gt 0 ]]; do
  case "$1" in
    --build)
      BUILD_IMAGE=1
      shift
      ;;
    --image)
      IMAGE="${2:-}"
      shift 2
      ;;
    --runtime-dir)
      RUNTIME_DIR="${2:-}"
      shift 2
      ;;
    --output-dir)
      OUTPUT_DIR="${2:-}"
      shift 2
      ;;
    --network)
      NETWORK_MODE="${2:-}"
      shift 2
      ;;
    --)
      shift
      break
      ;;
    -h|--help)
      usage
      exit 0
      ;;
    *)
      echo "Unknown runner option: $1" >&2
      usage >&2
      exit 2
      ;;
  esac
done
if [[ $# -eq 0 ]]; then
  echo "Missing entrypoint args. Use -- to pass container args." >&2
  usage >&2
  exit 2
fi

if ! command -v docker >/dev/null 2>&1; then
  echo "docker command not found" >&2
  exit 127
fi

if [[ ! -d "$RUNTIME_DIR" ]]; then
  echo "Runtime dir not found: $RUNTIME_DIR" >&2
  echo "It must contain data_api.py for real API calls." >&2
  exit 1
fi

if [[ ! -f "$RUNTIME_DIR/data_api.py" ]]; then
  echo "Runtime module missing: $RUNTIME_DIR/data_api.py" >&2
  exit 1
fi

mkdir -p "$OUTPUT_DIR"
# Keep container non-root while ensuring mounted output path is writable.
chmod 0777 "$OUTPUT_DIR" 2>/dev/null || true

if [[ "$BUILD_IMAGE" -eq 1 ]] || ! docker image inspect "$IMAGE" >/dev/null 2>&1; then
  docker build -t "$IMAGE" -f "$SCRIPT_DIR/docker/Dockerfile" "$SCRIPT_DIR/docker"
fi

docker run --rm \
  --network "$NETWORK_MODE" \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64m \
  --tmpfs /var/tmp:rw,noexec,nosuid,size=32m \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --pids-limit 256 \
  --memory 512m \
  --cpus 1.0 \
  -e SIMILARWEB_API_KEY \
  -e SIMILARWEB_BASE_URL \
  -e RAPIDAPI_KEY \
  -e RAPIDAPI_SIMILARWEB_HOST \
  -v "$RUNTIME_DIR:/opt/.manus/.sandbox-runtime:ro" \
  -v "$OUTPUT_DIR:/data:rw" \
  "$IMAGE" "$@"


@@ -0,0 +1,166 @@
#!/usr/bin/env python3
"""Minimal ApiClient runtime for SimilarWeb skill.

Implements the subset of Manus-style interface used by the skill:
    ApiClient().call_api(api_name, path_params={"domain": ...}, query={...})

Primary mode: Similarweb official API (requires SIMILARWEB_API_KEY)
Fallback mode: RapidAPI similarweb13 domain snapshot (requires RAPIDAPI_KEY)
"""
from __future__ import annotations

import json
import os
import urllib.error  # imported explicitly because HTTPError/URLError are caught below
import urllib.parse
import urllib.request
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


class ApiError(RuntimeError):
    pass


@dataclass
class EndpointSpec:
    path: str


OFFICIAL_ENDPOINTS: Dict[str, EndpointSpec] = {
    "SimilarWeb/get_global_rank": EndpointSpec("/v1/website/{domain}/global-rank/global-rank"),
    "SimilarWeb/get_visits_total": EndpointSpec("/v1/website/{domain}/total-traffic-and-engagement/visits"),
    "SimilarWeb/get_unique_visit": EndpointSpec("/v1/website/{domain}/deduplicated-audience/deduplicated-audience"),
    "SimilarWeb/get_bounce_rate": EndpointSpec("/v1/website/{domain}/total-traffic-and-engagement/bounce-rate"),
    "SimilarWeb/get_traffic_sources_desktop": EndpointSpec("/v1/website/{domain}/traffic-sources/desktop"),
    "SimilarWeb/get_traffic_sources_mobile": EndpointSpec("/v1/website/{domain}/traffic-sources/mobile-web"),
    "SimilarWeb/get_total_traffic_by_country": EndpointSpec("/v1/website/{domain}/geography/total-traffic-and-engagement"),
}
class ApiClient:
    def __init__(
        self,
        *,
        similarweb_api_key: Optional[str] = None,
        similarweb_base_url: Optional[str] = None,
        rapidapi_key: Optional[str] = None,
        rapidapi_host: Optional[str] = None,
        timeout: int = 30,
    ) -> None:
        self.similarweb_api_key = similarweb_api_key or os.getenv("SIMILARWEB_API_KEY")
        self.similarweb_base_url = (
            similarweb_base_url
            or os.getenv("SIMILARWEB_BASE_URL")
            or "https://api.similarweb.com"
        ).rstrip("/")
        self.rapidapi_key = rapidapi_key or os.getenv("RAPIDAPI_KEY")
        self.rapidapi_host = rapidapi_host or os.getenv("RAPIDAPI_SIMILARWEB_HOST") or "similarweb13.p.rapidapi.com"
        self.timeout = timeout

    def call_api(
        self,
        api_name: str,
        *,
        path_params: Optional[Mapping[str, Any]] = None,
        query: Optional[Mapping[str, Any]] = None,
    ) -> Dict[str, Any]:
        path_params = dict(path_params or {})
        query = dict(query or {})
        domain = str(path_params.get("domain", "")).strip()
        if not domain:
            raise ApiError("path_params.domain is required")
        if self.similarweb_api_key:
            return self._call_official(api_name, domain=domain, query=query)
        if self.rapidapi_key:
            return self._call_rapidapi_snapshot(api_name, domain=domain, query=query)
        raise ApiError(
            "No credentials configured. Set SIMILARWEB_API_KEY (preferred) or RAPIDAPI_KEY."
        )

    def _call_official(self, api_name: str, *, domain: str, query: Dict[str, Any]) -> Dict[str, Any]:
        spec = OFFICIAL_ENDPOINTS.get(api_name)
        if not spec:
            raise ApiError(f"Unsupported api_name for official mode: {api_name}")
        path = spec.path.format(domain=domain)
        q = self._clean_query(query)
        q["api_key"] = self.similarweb_api_key
        url = f"{self.similarweb_base_url}{path}?{urllib.parse.urlencode(q)}"
        req = urllib.request.Request(url=url, method="GET")
        return self._do_request(req, mode="official", api_name=api_name, url=url)

    def _call_rapidapi_snapshot(self, api_name: str, *, domain: str, query: Dict[str, Any]) -> Dict[str, Any]:
        encoded_domain = urllib.parse.quote(domain)
        url = f"https://{self.rapidapi_host}/v2/getdomain?domain={encoded_domain}"
        headers = {
            "x-rapidapi-key": self.rapidapi_key or "",
            "x-rapidapi-host": self.rapidapi_host,
        }
        req = urllib.request.Request(url=url, method="GET", headers=headers)
        resp = self._do_request(req, mode="rapidapi", api_name=api_name, url=url)
        return {
            "_adapter": {
                "mode": "rapidapi",
                "note": "Using /v2/getdomain snapshot fallback; not 1:1 with official endpoint schema.",
                "requested_api": api_name,
                "requested_query": query,
            },
            "data": resp,
        }
    @staticmethod
    def _clean_query(query: Mapping[str, Any]) -> Dict[str, Any]:
        out: Dict[str, Any] = {}
        for k, v in query.items():
            if v is None:
                continue
            if isinstance(v, bool):
                out[k] = "true" if v else "false"
            else:
                out[k] = str(v)
        return out

    def _do_request(self, req: urllib.request.Request, *, mode: str, api_name: str, url: str) -> Dict[str, Any]:
        try:
            with urllib.request.urlopen(req, timeout=self.timeout) as resp:
                body = resp.read().decode("utf-8", errors="replace")
                data = json.loads(body) if body else {}
                return {
                    "_meta": {
                        "mode": mode,
                        "api_name": api_name,
                        "http_status": resp.status,
                        "url": url,
                    },
                    "response": data,
                }
        except urllib.error.HTTPError as exc:
            body = exc.read().decode("utf-8", errors="replace")
            try:
                parsed = json.loads(body)
            except Exception:
                parsed = {"raw": body}
            raise ApiError(
                json.dumps(
                    {
                        "http_status": exc.code,
                        "mode": mode,
                        "api_name": api_name,
                        "url": url,
                        "error": parsed,
                    },
                    ensure_ascii=False,
                )
            )
        except urllib.error.URLError as exc:
            raise ApiError(f"Network error for {url}: {exc}")


__all__ = ["ApiClient", "ApiError"]


@@ -0,0 +1,40 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
RUNTIME_FIXTURE_DIR="$SCRIPT_DIR/tests/fixtures"
OUTPUT_DIR="${1:-$SCRIPT_DIR/../tmp/test-output}"
RUNNER="$SCRIPT_DIR/run_in_docker.sh"

mkdir -p "$OUTPUT_DIR"

echo "[1/4] Build image + self-test"
"$RUNNER" --build --runtime-dir "$RUNTIME_FIXTURE_DIR" --output-dir "$OUTPUT_DIR" -- --self-test

echo "[2/4] Dry-run validation"
"$RUNNER" --runtime-dir "$RUNTIME_FIXTURE_DIR" --output-dir "$OUTPUT_DIR" -- \
  --api visits-total \
  --domain amazon.com \
  --country world \
  --dry-run

echo "[3/4] Mock call writes output file"
"$RUNNER" --runtime-dir "$RUNTIME_FIXTURE_DIR" --output-dir "$OUTPUT_DIR" -- \
  --api global-rank \
  --domain amazon.com \
  --mock-result \
  --output /data/mock-global-rank.json
test -f "$OUTPUT_DIR/mock-global-rank.json"

echo "[4/4] Fixture ApiClient end-to-end call writes output"
"$RUNNER" --runtime-dir "$RUNTIME_FIXTURE_DIR" --output-dir "$OUTPUT_DIR" -- \
  --api traffic-by-country \
  --domain amazon.com \
  --start-date 2025-12 \
  --end-date 2026-02 \
  --limit 3 \
  --output /data/fixture-traffic-by-country.json
test -f "$OUTPUT_DIR/fixture-traffic-by-country.json"

echo "All tests passed. Output dir: $OUTPUT_DIR"


@@ -0,0 +1,8 @@
class ApiClient:
    def call_api(self, api_name, path_params=None, query=None):
        return {
            "fixture": True,
            "api_name": api_name,
            "path_params": path_params or {},
            "query": query or {},
        }