Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DATE columns set values with timezone offsets, producing conflicts when db server is on a different timezone. #462

Open
janigro opened this issue Feb 25, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@janigro
Copy link

janigro commented Feb 25, 2024

Problem description

First I should say that I read issue #453, although I think this one is slightly different.

When saving a date to a DATE column, the INSERT specifies a timezone offset. This is not usually a problem, if we are in the same timezone as the db server. But working with a server on America/New_York, inserting 2000-01-01 00:00:00+00:00 will actually insert 1999-12-31 (as in New York, it is not 2000-01-01 yet).

This behaviour differs from MySQL, which handles (to my understanding) this case correctly. Inspecting the statement logs in both servers, there is a clear difference in the statement they produce (bindings where substituted by their values for clarity):

MySQL:    INSERT INTO `friends` (`id`, `name`, `birthday`) VALUES (DEFAULT, 'Paul', '2000-01-01 00:00:00')
Postgres: INSERT INTO "friends" ("id", "name", "birthday") VALUES (DEFAULT, 'Paul', '1999-12-31 19:00:00-05') RETURNING "id"

In MySQL, we can ignore the server timezone, and simply use UTC. This guarantees the date stored is the one we will fetch. With Postgres however, we need to be aware of the server timezone at all times, in order to create the Date objects to be inserted, using the server's timezone.

I cannot see the reasoning behind this (I'm sure there's one, I just don't see it).

To Reproduce

A clear example for using the DATE column type, is for storing birthday dates. The time is completely irrelevant and even less is its timezone.

final class Friend: Model, Content {
    static let schema = "friends"

    @ID(custom: "id") public var id: Int?

    @Field(key: "name") public var name: String
    @Field(key: "birthday") public var birthday: Date

    init() { }
    
    init(id: Int? = nil, name: String, birthday: Date) {
        self.id = id
        self.name = name
        self.birthday = birthday
    }
}

extension Friend {
    struct Migration: Fluent.AsyncMigration {
        
        func prepare(on database: Database) async throws {
            
            try await database.schema(Friend.schema)
                .field(.id, .int, .identifier(auto: true))
                .field("name", .string, .required)
                .field("birthday", .date, .required)
                .create()
        }
        
        func revert(on database: Database) async throws {
            try await database.schema(Friend.schema).delete()
        }
    }
}
let dateFormatter = DateFormatter()
dateFormatter.dateFormat = "yyyy-MM-dd"
dateFormatter.timeZone = TimeZone(secondsFromGMT: 0)

try await Friend(name: "Paul", 
                 birthday: dateFormatter.date(from: "2000-01-01")!).save(on: db)

For this to work correctly on Postgres, we need to know the server's timezone and use it to create the Date object.

dateFormatter.timeZone = .init(identifier: "America/New_York")

Expected behavior

I would think that the result should be the same, no matter which database we are connecting to. I absolutely prefer MySQL's behavior.

If for any reason this is not possible, would it be too crazy/weird to ask that defining the Model property as String (instead of Date) would work with DATE column types too? That way we could do:

try await Friend(name: "Paul", birthday: "2000-01-01").save(on: db)

and that would translate into:

INSERT INTO "friends" ("id", "name", "birthday") VALUES (DEFAULT, 'Paul', '2000-01-01') RETURNING "id"

Another and better solution could be having a different swift type altogether (e.g., DateWithoutTime) and get rid of all ambiguity.

In my case, I can afford to do ALTER DATABASE SET TIMEZONE to 'UTC' and forget about it... but that may not always be possible.

Environment

framework: 4.77.0
toolbox: 18.7.4
postgres-nio: 1.17.0

@janigro janigro added the bug Something isn't working label Feb 25, 2024
@gwynne
Copy link
Member

gwynne commented Feb 25, 2024

I believe this is ultimately a side effect of your use of .date as the field type used for the birthday field in the Fluent migration - the Fluent Postgres driver maps .date to Postgres's DATE type. However, the Date encoding logic in PostgresNIO unconditionally sends Date values to the server encoded as binary-format TIMESTAMP WITH TIME ZONE (aka timestamptz) values. Although the database type says "with time zone", the format is actually a count of seconds since 2000-01-01T00:00:00Z, always in UTC. Postgres then performs the appropriate time zone adjustment when implicitly coercing the timestamptz input to the column's date type. The reverse does not take place when the values are loaded from the database because Postgres will send it as a DATE (reasonably enough), which does not include the time zone offset needed to perform the inverse adjustment (nor is there, for that matter, any reliable indicator that such an adjustment is necessary by that point).

In short, Fluent and PostgresNIO both act, of necessity, on the assumption that Dates are associated with TIMESTAMPTZ columns (in your migration, change .date to .datetime). In fact, it is safe to say as a general statement that Fluent is not guaranteed to handle either DATE or TIME correctly regardless of which database driver one is using (for example, with MySQL, TIME columns just flat-out don't work at all).

Note

If you really do need a DATE column rather than a TIMESTAMPTZ for your use case (I have found this to be rare in practice, but YMMV of course), you have a few options:

  1. Use SQLKit , and explicitly specify the column in the SQL as "birthday"::"timestamptz", so that Postgres performs the reverse conversion for you. (I dare say this would rather defeat the purpose of the question, however.)
  2. Sidestep the assumptions made by the Date-based logic by using one of the alternate TimestampFormats available with the @Timestamp property wrapper - the ISO8601TimestampFormat would work nicely if you tweaked it to allow you configure it with the database's time zone (getting that info out of the database is left as an exercise for the reader 😅).
  3. The second solution ignores that @Timestamp would be better named @OptionalTimestamp, since it requires the value be Date? rather than just Date. This in turn is easily fixed: write a separate wrapper that doesn't require that, as shown by this sample code. This is working code that can be dropped into your project and used as-is (except that you will have to delete the last line of the file). I've used it without issue in production systems; the only reason I never included it in Fluent is it's not a common enough need to be worth adding even more confusion, especially when its naming is inconsistent 😅.

If none of those works for you, I'm afraid you're out of luck - we can't even look into changing Fluent's existing behavior in this respect unless we can also figure out a way to handle it better in PostgresNIO first (and even then the changes would end up happening in Fluent 5).

@janigro
Copy link
Author

janigro commented Feb 25, 2024

Thank you for lightning-fast response. I now better understand where the issue is, and also recognize that changing Fluent 4 to make DATE (i.e., .date) work would also probably mean breaking existing applications that already found their own workarounds.

As to why the importance of DATE (you mention you find this rare in practice)... I agree, but only if we assume we create the schemas ourselves and have the power to choose the field type. Sometimes we need to work with existing databases for which we can only use DML, but no DDL.

In any case, I appreciate your very detailed reply. You have given me many options to work with! Thank you! 😀

@gwynne
Copy link
Member

gwynne commented Feb 25, 2024

Let's just say that this sort of issue, and the difficulty in addressing it in any helpful fashion even if Fluent 4's API could be more freely changed, is a notable entry on the long list of reasons that the very preliminary design work I've finally started for Fluent 5 centers around a much more SQLKit-like API than what we have now 😅.

Oh, and I forgot - there is one other option you could go with:

  • Define a new type - or, in fact, for this example, let's use Foundation's DateComponents type, which is the closest thing Swift or Foundation has to a "date-only" or "time-only" representation in any case. The intent is to import PostgresNIO, and conform the type (whether a custom one or not) to the PostgresDecodable and PostgresNonThrowingEncodable protocols. Those implementations can then do what the same conformances that exist for Date do not: recognize Postgres's PostgresDataType.date, .time, .timestamp, and .timestamptz types in PostgresDataFormat.binary format during decoding and choosing the most appropriate one to use during encoding. (Be careful - these are NOT the same as Fluent's DatabaseSchema.DataType.time, .date, .datetime etc. types; the Postgres ones directly refer to Postgres type OIDs, as exchanged via the wire protocol.) How you would handle the time zone adjustment for .date. and .timestamp is up to you - you can't perform a query during encoding or decoding, nor can you add a typecast to the query (although you could tweak RequiredTimestampProperty's AnyDatabaseProperty conformance so that the DatabaseQuery.Values saved to DatabaseInput and read from DatabaseOutput used the .custom() case with an appropriate SQLExpression... but that gets even more complicated 😅). Once you have a type conforming to the PostgresCodable protocols, you can just use that type as the value of an @Field property, and Fluent (via FluentPostgresDriver, then PostgresKit) will eventually call your implementations at the appropriate times 😅. Have a look at Date+PostgresCodable.swift in PostgresNIO for an idea of what's needed.

(Or, you know, just use TIMESTAMPTZ 😂 😂 😂 But you're absolutely right that that isn't always a viable option.)

@gwynne
Copy link
Member

gwynne commented Feb 25, 2024

(At the end of the day, it's just another "ORMs are the reason anyone who doesn't work with either audio equipment or electronics has ever even heard of an impedance mismatch" thing, and the last-second choice to shoehorn support for MongoDB into Fluent 4 immediately before the GM release elevated the mismatch to what I'd call almost a flat-out incompatibility after 3 or 4 years of fighting with it 😆. At least I can blame that particular call on @tanner0101, I was against it at the time 🤣.)

@janigro
Copy link
Author

janigro commented Feb 25, 2024

I really like this new option. Probably too much for my project, because I can really go with TIMESTAMPTZ or simply change the timezone of the database. However, I still want to try PostgresDecodable, if only as an excercise to better understand the whole process. Thanks again! 😁

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants