Update README to match pgai docs #7464

Open. Wants to merge 12 commits into base: main

atovpeko
Contributor

No description provided.

atovpeko and others added 5 commits November 20, 2024 10:47
…in-the-timescaledb-github-repo-to-match-the-pgai-docs' into 3269-docs-rfc-update-the-readme-in-the-timescaledb-github-repo-to-match-the-pgai-docs
Contributor

@iroussos left a comment

Thank you @atovpeko, the core sections look great! I have added a few comments for you to consider.

README.md Outdated
Comment on lines 11 to 22
<h3>TimescaleDB is an extension for PostgreSQL that enables time-series workloads while increasing ingest, query, storage, and analytics performance</h3>

## TimescaleDB
[![Docs](https://img.shields.io/badge/Read_the_Timescale_docs-black?style=for-the-badge&logo=readthedocs&logoColor=white)](https://docs.timescale.com/)
[![SLACK](https://img.shields.io/badge/Ask_the_Timescale_community-black?style=for-the-badge&logo=slack&logoColor=white)](https://timescaledb.slack.com/archives/C4GT3N90X)
[![Try TimescaleDB for free](https://img.shields.io/badge/Try_Timescale_for_free-black?style=for-the-badge&logo=timescale&logoColor=white)](https://console.cloud.timescale.com/signup)

TimescaleDB is an open-source database designed to make SQL scalable for
time-series data. It is engineered up from PostgreSQL and packaged as a
PostgreSQL extension, providing automatic partitioning across time and space
(partitioning key), as well as full SQL support.
</div>

TimescaleDB scales PostgreSQL for time-series data with the help of [hypertables](https://docs.timescale.com/use-timescale/latest/hypertables/about-hypertables/). Hypertables are PostgreSQL tables that automatically partition your data by time and space. You interact with a hypertable in the same way as a regular PostgreSQL table. Behind the scenes, the database performs the work of setting up and maintaining the hypertable's partitions.

From the perspective of both use and management, TimescaleDB looks and feels like PostgreSQL, and can be managed and queried as
such. However, it provides a range of features and optimizations that make managing your time-series data easier and more efficient.
Contributor

I think the messaging is not fully consistent with how we talk about Timescale(DB) on timescale.com.

I would propose quickly coordinating with marketing on the messaging as well, since we are now expanding to more use cases than pure time series and are also starting to talk about Hypercore, our hybrid row-columnar engine. As an example from the website:

> Time series, events, and analytics
>
> For ingesting and querying vast amounts of live data. Our hybrid row-columnar engine makes queries up to 350x faster, ingests 44% faster and reduces storage by 95% over RDS.

Contributor

@iroussos Nov 20, 2024

Please note that I am not including other use cases that we have on Timescale Cloud (like RAG, search, and AI), which are supported in other ways.

The additional events and (real-time) analytics use cases that I mention in my previous comment are fully served by the TimescaleDB extension (this repo).

README.md Outdated
- [About time buckets](https://docs.timescale.com/use-timescale/latest/time-buckets/about-time-buckets/)
- [API reference](https://docs.timescale.com/api/latest/hyperfunctions/time_bucket/)
- [All TimescaleDB features](https://docs.timescale.com/use-timescale/latest/)
- [Tutorials](https://docs.timescale.com/tutorials/latest/)

Contributor

We are missing a section about enabling Hypercore (compression) here. This is the most important feature after hypertables themselves, so we want all users to know about it and to start enabling Hypercore from the get-go.

Contributor Author

@atovpeko Nov 25, 2024

I don't think Hypercore is released yet though? At least we don't have it in the docs. So for now we either add it and call it compression, or wait until Hypercore is released?

[Timescale Cloud](https://tsdb.co/GitHubTimescale), a fully-managed TimescaleDB in the cloud, is
available via a free trial. Create a PostgreSQL database in the cloud with TimescaleDB pre-installed
so you can power your application with TimescaleDB without the management overhead.
Continuous aggregates are designed to make queries on very large datasets run faster. They use PostgreSQL [materialized views](https://www.postgresql.org/docs/current/rules-materializedviews.html) to continuously and incrementally refresh a query in the background, so that when you run the query, only the data that has changed needs to be computed, not the entire dataset.
Contributor

@iroussos Nov 20, 2024

Small nit for this section, and for the same intro on CAggs that we have in the docs now that I've seen it: continuous aggregates are materialized views, but they do not use PostgreSQL materialized views to keep data in sync. Normally, materialized views cannot be incrementally materialized; they have to be rebuilt from scratch every time you want to refresh them (yes, going over 100s of GBs or TBs of data in large-scale use cases).

The whole mechanism to continuously and incrementally refresh them in the background is the special sauce of Timescale.
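
To make the distinction concrete, here is a toy Python sketch of the two refresh strategies. It is not TimescaleDB code and makes no claim about how the extension implements this; `Table`, `full_refresh`, `incremental_refresh`, and the 60-second bucket width are all hypothetical names and values chosen for illustration. It only shows why tracking changed time buckets lets a refresh touch a fraction of the data.

```python
from collections import defaultdict

BUCKET_SECONDS = 60  # illustrative bucket width, an assumption


def bucket_of(ts):
    """Start of the time bucket that contains timestamp ts."""
    return ts - ts % BUCKET_SECONDS


class Table:
    """Toy raw-data table that remembers which buckets changed."""

    def __init__(self):
        self.by_bucket = defaultdict(list)  # bucket start -> values
        self.dirty = set()                  # buckets changed since last refresh

    def insert(self, ts, value):
        bucket = bucket_of(ts)
        self.by_bucket[bucket].append(value)
        self.dirty.add(bucket)  # invalidate only the touched bucket


def full_refresh(table):
    """Plain materialized view: recompute every bucket from scratch."""
    totals, scanned = {}, 0
    for bucket, values in table.by_bucket.items():
        totals[bucket] = sum(values)
        scanned += len(values)
    return totals, scanned


def incremental_refresh(table, totals):
    """Recompute only the buckets marked dirty; return rows scanned."""
    scanned = 0
    for bucket in table.dirty:
        values = table.by_bucket[bucket]
        totals[bucket] = sum(values)
        scanned += len(values)
    table.dirty.clear()
    return scanned
```

In this model, after an initial refresh, a single late insert makes the incremental path rescan only the rows of the one affected bucket, while the full rebuild rescans every row in the table.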

Contributor

Actually, CAggs don't use Postgres materialized views. In fact, CAggs are our MatView implementation on steroids (incremental refresh, data retention, compression/columnar storage). But we can't say that a CAgg is a materialized view, because users would expect to see it under the MatView section of their favourite client application connected to the database, when it actually appears under the Views section, because we expose it to users as a Postgres view.

Contributor Author

@atovpeko Nov 25, 2024

Is there a good approved description anywhere (website, blog) that I could reuse?


### Architecture documents
TimescaleDB tiered storage architecture includes a standard high-performance storage tier and a low-cost object storage tier. You can migrate rarely used data to the object storage to cut costs. Data is tiered on the level of chunks, that is, individual parts of tables. This means that a single table can be spread among storage tiers for ultimate cost optimization.
Contributor

We have to make it crystal clear that this feature is only available on Timescale Cloud and is not accessible in any self-hosted edition.

The same goes for the next section on HA: both are cloud-only and are also not driven by code in this extension (other than some very light integration with the dedicated cloud-native extension for tiered storage). HA is a pure infrastructure feature at the server level, built on top of Timescale (PostgreSQL) in general.

I would not include those, but if we really want to have them somewhere for some reason, I would put them at the end in a separate, dedicated section labeled "Cloud-only features" or something similar. They still seem out of place to me for this repository.

FYI @ramonguiu for your opinion

Contributor

Do we want to replace these sections with one on Cloud features as an upsell? For example, the text in https://docs.timescale.com/getting-started/latest/? (You have to refresh to see the latest content)

Contributor

IMHO we should focus here only on features available in the extension. Cloud features should not be mentioned, so that we don't confuse community users.


A getting started section is more valuable. I left a comment with an example above.


### Releases & updates
Timescale is a PostgreSQL database company. To learn more, visit [timescale.com](https://www.timescale.com).
Contributor

Should we check with marketing on how we talk about ourselves? This is also not the same way we talk about Timescale in https://www.timescale.com/about

Comment on lines +1 to +17
<div align=center>
<picture align=center>
<source media="(prefers-color-scheme: dark)" srcset="https://assets.timescale.com/docs/images/timescale-logo-dark-mode.svg">
<source media="(prefers-color-scheme: light)" srcset="https://assets.timescale.com/docs/images/timescale-logo-light-mode.svg">
<img alt="Timescale logo" >
</picture>
</div>

<div align=center>

<h3>TimescaleDB is an extension for PostgreSQL that enables time-series workloads while increasing ingest, query, storage, and analytics performance</h3>

[![Docs](https://img.shields.io/badge/Read_the_Timescale_docs-black?style=for-the-badge&logo=readthedocs&logoColor=white)](https://docs.timescale.com/)
[![SLACK](https://img.shields.io/badge/Ask_the_Timescale_community-black?style=for-the-badge&logo=slack&logoColor=white)](https://timescaledb.slack.com/archives/C4GT3N90X)
[![Try TimescaleDB for free](https://img.shields.io/badge/Try_Timescale_for_free-black?style=for-the-badge&logo=timescale&logoColor=white)](https://console.cloud.timescale.com/signup)

## TimescaleDB
</div>
Contributor

This file is Markdown so why add HTML???

Contributor Author

@atovpeko Nov 25, 2024

I am centering this section and making the images clickable - just like the template I am using for this: https://raw.githubusercontent.com/timescale/pgai/refs/heads/main/README.md. Not sure raw Markdown can handle this

README.md Outdated

### Useful tools
# Ensure high availability
Contributor

IMHO this section should not be added since it is a cloud feature not an extension feature.


++ to Fabrizio's comment. HA is not a feature shipped with the extension, we can remove it.

Comment on lines -171 to -173
- [Slack Channel](https://slack.timescale.com)
- [Github Issues](https://github.com/timescale/timescaledb/issues)
- [Timescale Support](https://tsdb.co/GitHubTimescaleSupport): see support options (community & subscription)
Contributor

Why remove those links???

Contributor Author

They have moved to the top of the page

Comment on lines -181 to -184
### Contributing

- [Contributor instructions](https://github.com/timescale/timescaledb/blob/main/CONTRIBUTING.md)
- [Code style guide](https://github.com/timescale/timescaledb/blob/main/docs/StyleGuide.md)
Contributor

Why remove it??

Contributor Author

These have moved but are still on the page

Co-authored-by: Fabrízio de Royes Mello <[email protected]>
Signed-off-by: atovpeko <[email protected]>
> Status and recommendations:
> - **Users of [Timescale Cloud](https://console.cloud.timescale.com/) are unaffected**. We are currently not upgrading cloud databases to these latest minor PG releases. But regardless, Timescale Cloud recompiles TimescaleDB against each new minor Postgres version, which would prevent any such incompatibility.
> - **Users of Timescale's [k8s docker image](https://github.com/timescale/timescaledb-docker-ha) are unaffected**. We are currently not building a new release against these latest minor PG releases. But regardless, our docker image build process recompiles TimescaleDB against each new minor Postgres version, which would prevent any such incompatibility.
> - Users of other managed clouds (using TimescaleDB Apache-2 Edition) are recommended to not upgrade to these latest minor PG releases at this time, or discuss with their cloud provider how they build TimescaleDB with new minor releases.
> - Users who self-manage TimescaleDB are recommended to not upgrade to these latest minor PG releases at this time.
>
>
> We are working with the PG community about how best to address this issue. See [this thread on pgsql-hackers](https://www.postgresql.org/message-id/flat/CABOikdNmVBC1LL6pY26dyxAS2f%2BgLZvTsNt%3D2XbcyG7WxXVBBQ%40mail.gmail.com) for more info.
>
> Thanks for your understanding! 🙏

Can we add a getting started section here, e.g.:

Getting started with TimescaleDB

Successfully merging this pull request may close these issues.

[Docs RFC] Update the readme in the TimescaleDB github repo to match the pgai docs
5 participants