Background & Motivation
We propose adding a url() scalar function that enables executing HTTP/HTTPS requests directly from SQL queries. This function would allow users to:
- Send webhook notifications (e.g., Slack, Discord) triggered by query results
- Integrate with external REST APIs for data enrichment
- Enable real-time event-driven workflows from within SQL
Use Cases
- Alerting: Send Slack notifications when anomaly detection queries find issues
- Data Enrichment: Call external APIs to augment query results
- Webhook Integration: Trigger external workflows based on data changes
Proposed Interface
-- Basic GET request
SELECT url('https://api.example.com/data');
-- With configuration (method, headers, body)
SELECT url(
'https://hooks.slack.com/services/xxx',
'{"method": "POST", "headers": {"Content-Type": "application/json"}, "body": {"text": "Hello"}}'
);
Return Value
- Returns a VARCHAR containing the HTTP response body
- Returns a JSON object with status, body, and headers for detailed response handling
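To make the JSON configuration argument and the detailed return shape above more concrete, here is a minimal Python sketch of how an implementation might parse the config and serialize a response envelope. The helper names (parse_config, build_response), the accepted method set, and the default of GET are assumptions for illustration only; the proposal does not specify them.

```python
import json

# Assumed set of accepted methods; the proposal does not enumerate them.
ALLOWED_METHODS = {"GET", "POST", "PUT", "PATCH", "DELETE", "HEAD"}


def parse_config(config_json=None):
    """Parse the optional JSON config string into (method, headers, body)."""
    cfg = json.loads(config_json) if config_json else {}
    if not isinstance(cfg, dict):
        raise ValueError("config must be a JSON object")

    # Assumption: a missing method means GET, matching the basic example.
    method = str(cfg.get("method", "GET")).upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")

    headers = cfg.get("headers", {})
    if not isinstance(headers, dict):
        raise ValueError("headers must be a JSON object")

    body = cfg.get("body")
    # A JSON object or array body is re-serialized; a string body is kept as-is.
    if body is not None and not isinstance(body, str):
        body = json.dumps(body)
    return method, headers, body


def build_response(status, headers, body):
    """Serialize the detailed return shape: status, headers, and body."""
    return json.dumps({"status": status, "headers": headers, "body": body})
```

Under these assumptions, the Slack example's config would parse to the method POST, one header, and the body serialized back to a JSON string.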
Function Signature Options
-- Option A: Single function with JSON config
url(url VARCHAR, config VARCHAR) -> VARCHAR
-- Option B: Separate functions per method
http_get(url VARCHAR, headers VARCHAR) -> VARCHAR
http_post(url VARCHAR, body VARCHAR, headers VARCHAR) -> VARCHAR
Discussion needed: Which interface style is preferred?
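One point that may help the discussion: Option B could be layered on top of Option A, since each per-method call maps to a fixed config object. The Python sketch below shows that mapping; the helper names http_get_config and http_post_config are hypothetical and exist only to illustrate the equivalence.

```python
import json


def http_get_config(headers=None):
    """Option A config equivalent to http_get(url, headers)."""
    return json.dumps({"method": "GET", "headers": headers or {}})


def http_post_config(body, headers=None):
    """Option A config equivalent to http_post(url, body, headers)."""
    return json.dumps({"method": "POST", "headers": headers or {}, "body": body})
```

If this mapping holds, the choice between the two options is mostly about SQL ergonomics rather than capability.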
Security Considerations
This function introduces potential SSRF (Server-Side Request Forgery) risks. Other systems address comparable risks as follows:
| Database | Approach |
|---|---|
| ClickHouse | remote_url_allow_hosts configuration |
| Snowflake | Network Rules + External Access Integration |
| Databricks | Network Policies with Allowed Domains (FQDN) |
| DuckDB | enable_external_access variable |
| PostgreSQL (pgsql-http) | Function-level permission control |
Proposed Security Approach
Phase 1 (Initial Implementation):
- Admin-configurable variable to enforce HTTPS with certificate validation
- Rely on infrastructure-level controls (Security Groups, Network Firewalls) for host restrictions
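As a rough illustration of the Phase 1 HTTPS check, the Python sketch below rejects non-HTTPS URLs when an admin flag is set and builds a TLS context that verifies certificates. The names check_url, require_https, and tls_context are assumptions; the actual variable name and enforcement point are not defined in this proposal.

```python
import ssl
from urllib.parse import urlsplit


def check_url(url, require_https=True):
    """Validate the URL scheme; require_https stands in for the admin variable."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    if require_https and scheme != "https":
        raise ValueError("plain HTTP is disabled; use https://")
    if not parts.hostname:
        raise ValueError("URL has no host")
    return parts


def tls_context():
    """Return a TLS context with certificate and hostname verification enabled."""
    # ssl.create_default_context() sets CERT_REQUIRED and check_hostname=True.
    return ssl.create_default_context()
```

Host restrictions are deliberately left out of this sketch, since Phase 1 delegates them to infrastructure-level controls.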
Future Consideration:
- Connection/Integration object similar to Databricks (if needed for SaaS/managed deployments)
Open Questions
- Response size limits?
- Timeout configuration?
- Rate limiting considerations?
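To ground the first two questions, here is one possible shape for a response size cap and a request timeout, sketched in Python. The function names and the default values (5 seconds, 1 MiB) are placeholders chosen for illustration, not proposed settings; rate limiting is not covered.

```python
import io
import urllib.request


def read_capped(stream, max_bytes, chunk_size=8192):
    """Read a stream to the end, failing once more than max_bytes arrive."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ValueError(f"response exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


def fetch(url, timeout_s=5.0, max_bytes=1_048_576):
    """GET a URL with a socket timeout and a capped response body."""
    # timeout_s and max_bytes are hypothetical defaults for this sketch.
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return read_capped(resp, max_bytes)
```

Reading in chunks means an oversized response is rejected without buffering it in full, which matters when the function runs once per row.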