fix: filter out bad timestamp issue activities to prevent overflows #3693

Open
epipav wants to merge 3 commits into main from bugfix/tb-issue-analysis-filtering-old-dates-out

@epipav (Collaborator) commented Dec 12, 2025

We have 1970-dated issue activities that cause overflows when calculating `closedInSeconds` and `respondedInSeconds`. We now filter these activities out in the issue analysis copy pipe.


Note

Low Risk
Low risk: adds a simple timestamp-year filter in the issue_analysis_copy_pipe to prevent overflow errors, with the main impact being that some malformed historical records will no longer contribute to issue metrics.

Overview
Filters out issue activity rows with invalid/epoch-like timestamps by adding toYear(timestamp) >= 1971 constraints to the issues-opened, issues-closed, and issue-comment queries in issue_analysis_copy_pipe.pipe.

This prevents closedInSeconds/respondedInSeconds calculations from overflowing when bad 1970-dated events are present, at the cost of excluding those records from the issues_analyzed datasource.
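
A minimal Python sketch of the failure mode described above. It assumes, without confirmation from the PR, that the duration is computed as a difference of Unix timestamps and stored in an unsigned 32-bit field. The helper names and the 32-bit width are hypothetical and only illustrate why an epoch-dated event can yield an implausibly large value.

```python
from datetime import datetime, timezone

UINT32_MOD = 2**32  # assumed width of the duration field (hypothetical)


def seconds_between_wrapped(opened_at: datetime, closed_at: datetime) -> int:
    """Difference of Unix timestamps, wrapped the way an unsigned 32-bit value would."""
    diff = int(closed_at.timestamp()) - int(opened_at.timestamp())
    return diff % UINT32_MOD


def is_plausible(ts: datetime) -> bool:
    """Python analogue of the filter added in the PR: toYear(timestamp) >= 1971."""
    return ts.year >= 1971


opened = datetime(2025, 12, 1, tzinfo=timezone.utc)
good_close = datetime(2025, 12, 2, tzinfo=timezone.utc)
bad_close = datetime(1970, 1, 1, tzinfo=timezone.utc)  # epoch-dated bad record
```

Under these assumptions, the pair `(opened, bad_close)` produces a negative difference that wraps to a value in the billions of seconds, while `is_plausible(bad_close)` rejects the row before any subtraction happens.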

Written by Cursor Bugbot for commit 444beef.

@github-actions (Contributor)

⚠️ Jira Issue Key Missing

Your PR title doesn't contain a Jira issue key. Consider adding it for better traceability.

Example:

  • feat: add user authentication (CM-123)
  • feat: add user authentication (IN-123)

Projects:

  • CM: Community Data Platform
  • IN: Insights

Please add a Jira issue key to your PR title.

@epipav requested a review from joanagmaia December 12, 2025 08:56

joanagmaia previously approved these changes Dec 12, 2025

@joanagmaia (Contributor) left a comment

LGTM ✅

Do you think this might also be an issue for the pull request analysis data? In all widgets that rely on timestamps for calculations (for example, ones based on resolvedAt or mergedAt), shouldn't we always exclude these activities? 🤔
Everywhere else, where we only show totals, we should still include them.
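
One possible reading of this suggestion, sketched in Python: totals count every row, while duration metrics skip rows with epoch-like timestamps. The `Issue` shape and the `summarize` helper are hypothetical and do not reflect the real datasource schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from statistics import mean


@dataclass
class Issue:  # hypothetical row shape, not the real schema
    opened_at: datetime
    closed_at: datetime


def has_valid_timestamps(issue: Issue) -> bool:
    """Both timestamps must pass the same year >= 1971 check used in the PR."""
    return issue.opened_at.year >= 1971 and issue.closed_at.year >= 1971


def summarize(issues: list[Issue]) -> dict:
    """Count all issues, but average durations only over rows with valid timestamps."""
    valid = [i for i in issues if has_valid_timestamps(i)]
    durations = [(i.closed_at - i.opened_at).total_seconds() for i in valid]
    return {
        "total_issues": len(issues),
        "avg_closed_in_seconds": mean(durations) if durations else None,
    }
```

Keeping the filter next to the duration calculation, rather than at ingestion, is what lets the totals stay unchanged in this sketch.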

@joanagmaia (Contributor)

Hey @epipav, I believe this is already in production. Can we merge this one?

Copilot AI review requested due to automatic review settings March 23, 2026 17:21
@CLAassistant
CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copilot AI (Contributor) left a comment

Pull request overview

This PR updates the Tinybird issue analysis copy pipe to drop obviously-bad (pre-1971) issue activity timestamps that can lead to incorrect/overflowing duration calculations in downstream metrics.

Changes:

  • Filters issues-opened and issues-closed activities to toYear(timestamp) >= 1971 to exclude 1970-era bad data.
  • Normalizes the pipe TYPE value and adjusts whitespace/formatting in the pipe file.

@@ -1,5 +1,6 @@
  DESCRIPTION >
- Compacts activities from same issue into one, keeping necessary information in a single row. Helps to serve issue-wide widgets in the development tab.
+ Compacts activities from same issue into one, keeping necessary information in a single row. Helps to serve issue-wide widgets in the development tab.

Copilot AI Mar 23, 2026

Line 2 uses tab indentation under DESCRIPTION >, while other Tinybird pipes in this repo use spaces (e.g., pull_request_analysis_copy_pipe.pipe). Tabs can make diffs noisy and may be interpreted differently by tooling. Please switch this to consistent space indentation.

Suggested change
- Compacts activities from same issue into one, keeping necessary information in a single row. Helps to serve issue-wide widgets in the development tab.
+ Compacts activities from same issue into one, keeping necessary information in a single row. Helps to serve issue-wide widgets in the development tab.

Comment on lines 16 to +17
  FROM activityRelations_deduplicated_cleaned_bucket_union
- WHERE type = 'issues-opened'
+ WHERE type = 'issues-opened' AND toYear(timestamp) >= 1971

Copilot AI Mar 23, 2026

The new toYear(timestamp) >= 1971 predicate is non-sargable in ClickHouse/Tinybird and can prevent partition/primary-key pruning. Prefer a direct range filter like timestamp >= toDateTime('1971-01-01 00:00:00') (or toDate('1971-01-01') if timestamp is a Date) to keep the filter efficient.
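
A small Python sketch of why the two predicates select the same rows: comparing the extracted year against 1971 and comparing the timestamp against the first instant of 1971 agree on every value, including the boundary. The function names and sample values are illustrative only, and the sketch says nothing about how either form performs in ClickHouse.

```python
from datetime import datetime, timezone

# First instant of 1971 in UTC, mirroring toDateTime('1971-01-01 00:00:00').
CUTOFF = datetime(1971, 1, 1, tzinfo=timezone.utc)


def keep_by_year(ts: datetime) -> bool:
    """Analogue of toYear(timestamp) >= 1971."""
    return ts.year >= 1971


def keep_by_range(ts: datetime) -> bool:
    """Analogue of timestamp >= toDateTime('1971-01-01 00:00:00')."""
    return ts >= CUTOFF


samples = [
    datetime(1970, 1, 1, tzinfo=timezone.utc),                # epoch
    datetime(1970, 12, 31, 23, 59, 59, tzinfo=timezone.utc),  # last second of 1970
    datetime(1971, 1, 1, tzinfo=timezone.utc),                # boundary
    datetime(2025, 12, 12, 8, 56, tzinfo=timezone.utc),       # ordinary recent event
]
```

The equivalence holds here because all values share one time zone. If the column and the literal were interpreted in different zones, the boundary rows could differ.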

Comment on lines 25 to 27
  FROM activityRelations_deduplicated_cleaned_bucket_union
- WHERE type = 'issues-closed' AND sourceParentId != ''
+ WHERE type = 'issues-closed' AND sourceParentId != '' AND toYear(timestamp) >= 1971
  GROUP BY sourceParentId

Copilot AI Mar 23, 2026

Same as above: toYear(timestamp) >= 1971 is likely to be less efficient than a direct timestamp range predicate. Consider switching this to timestamp >= toDateTime('1971-01-01 00:00:00') for better query pruning.

  LEFT JOIN issues_comment AS comment ON opened.sourceId = comment.sourceParentId

- TYPE COPY
+ TYPE copy

Copilot AI Mar 23, 2026

TYPE copy deviates from the convention used across the Tinybird pipes in this repo (TYPE COPY). If Tinybird parsing or internal validation is case-sensitive (or if tooling expects the canonical form), this could break deployments. Please keep it consistent as TYPE COPY.

Suggested change
- TYPE copy
+ TYPE COPY

@cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@epipav force-pushed the bugfix/tb-issue-analysis-filtering-old-dates-out branch from 090ae14 to 9d1ab68 March 23, 2026 17:26

4 participants