Commit Graph

42 Commits

Author SHA1 Message Date
crapStone
044c684a47 Fix compression (#405)
closes #404

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/405
Co-authored-by: crapStone <me@crapstone.dev>
Co-committed-by: crapStone <me@crapstone.dev>
2024-11-25 12:21:55 +00:00
crapStone
85059aa46e cache pages (#403)
taken from https://codeberg.org/Codeberg/pages-server/pulls/301

Co-authored-by: Moritz Marquardt <git@momar.de>

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/403
Co-authored-by: crapStone <me@crapstone.dev>
Co-committed-by: crapStone <me@crapstone.dev>
2024-11-21 23:26:30 +00:00
ltdk
5b120f0488 Implement static serving of compressed files (#387)
This provides an option for #223 without fully resolving it. (I think.)

Essentially, it acts very similar to the `gzip_static` and similar options for nginx, where it will check for the existence of pre-compressed files and serve those instead if the client allows it. I couldn't find a pre-existing way to actually parse the Accept-Encoding header properly (admittedly didn't look very hard) and just implemented one on my own that should be fine.

This should hopefully not have the same DOS vulnerabilities as #302, since it relies on the existing caching system. Compressed versions of files will be cached just like any other files, and that includes cache for missing files as well.

The compressed files will also be accessible directly, and this won't automatically decompress them. So, if you have a `tar.gz` file that you access directly, it will still be downloaded as the gzipped version, although you will now gain the option to download the `.tar` directly and decompress it in transit. (Which doesn't affect the server at all, just the client's way of interpreting it.)

----

One key thing this change also adds is a short-circuit when accessing directories: these always return 404 via the API, although they'd try the cache anyway and go through that route, which was kind of slow. Adding in the additional encodings, it's going to try for .gz, .br, and .zst files in the worst case as well, which feels wrong. So, instead, it just always falls back to the index-check behaviour if the path ends in a slash or is empty. (Which is implicitly just a slash.)

----

For testing, I set up this repo: https://codeberg.org/clarfonthey/testrepo

I ended up realising that LFS wasn't supported by default with `just dev`, so, it ended up working until I made sure the files on the repo *didn't* use LFS.

Assuming you've run `just dev`, you can go directly to this page in the browser here: https://clarfonthey.localhost.mock.directory:4430/testrepo/
And also you can try a few cURL commands:

```shell
curl https://clarfonthey.localhost.mock.directory:4430/testrepo/ --verbose --insecure
curl -H 'Accept-Encoding: gz' https://clarfonthey.localhost.mock.directory:4430/testrepo/ --verbose --insecure | gunzip -
curl -H 'Accept-Encoding: br' https://clarfonthey.localhost.mock.directory:4430/testrepo/ --verbose --insecure | brotli --decompress -
curl -H 'Accept-Encoding: zst' https://clarfonthey.localhost.mock.directory:4430/testrepo/ --verbose --insecure | zstd --decompress -
```

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/387
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Co-authored-by: ltdk <usr@ltdk.xyz>
Co-committed-by: ltdk <usr@ltdk.xyz>
2024-09-29 21:00:54 +00:00
Peter Gerber
bc9111a05f Use correct timestamp format for Last-Modified header (#365)
HTTP uses GMT [1,2] rather than UTC as timezone for timestamps. However,
the Last-Modified header used UTC which confused at least wget.

Before, UTC was used:

$ wget --no-check-certificate -S --spider https://cb_pages_tests.localhost.mock.directory:4430/images/827679288a.jpg
...
  Last-Modified: Sun, 11 Sep 2022 08:37:42 UTC
...
Last-modified header invalid -- time-stamp ignored.
...

After, GMT is used:

$ wget --no-check-certificate -S --spider https://cb_pages_tests.localhost.mock.directory:4430/images/827679288a.jpg
...
  Last-Modified: Sun, 11 Sep 2022 08:37:42 GMT
...
(no last-modified-header-invalid warning)

[1]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Last-Modified
[2]: https://www.rfc-editor.org/rfc/rfc9110#name-date-time-formats

Fixes #364

---

Whatt I noticed is that the If-Modified-Since header isn't accepted (neither with GMT nor with UTC):

```
$ wget --header "If-Modified-Since: Sun, 11 Sep 2022 08:37:42 GMT" --no-check-certificate -S --spider https://cb_pages_tests.localhost.mock.directory:4430/images/827679288a.jpg
Spider mode enabled. Check if remote file exists.
--2024-07-15 23:31:41--  https://cb_pages_tests.localhost.mock.directory:4430/images/827679288a.jpg
Resolving cb_pages_tests.localhost.mock.directory (cb_pages_tests.localhost.mock.directory)... 127.0.0.1
Connecting to cb_pages_tests.localhost.mock.directory (cb_pages_tests.localhost.mock.directory)|127.0.0.1|:4430... connected.
WARNING: The certificate of ‘cb_pages_tests.localhost.mock.directory’ is not trusted.
WARNING: The certificate of ‘cb_pages_tests.localhost.mock.directory’ doesn't have a known issuer.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Allow: GET, HEAD, OPTIONS
  Cache-Control: public, max-age=600
  Content-Length: 124635
  Content-Type: image/jpeg
  Etag: "073af1960852e2a4ef446202c7974768b9881814"
  Last-Modified: Sun, 11 Sep 2022 08:37:42 GMT
  Referrer-Policy: strict-origin-when-cross-origin
  Server: pages-server
  Strict-Transport-Security: max-age=63072000; includeSubdomains; preload
  Date: Mon, 15 Jul 2024 21:31:42 GMT
Length: 124635 (122K) [image/jpeg]
Remote file exists
```

I would have expected a 304 (Not Modified) rather than a 200 (OK). I assume this is simply not supported and on production 304 is returned by a caching proxy in front of pages-server.

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/365
Reviewed-by: crapStone <codeberg@crapstone.dev>
Co-authored-by: Peter Gerber <peter@arbitrary.ch>
Co-committed-by: Peter Gerber <peter@arbitrary.ch>
2024-07-23 18:42:24 +00:00
Daniel Erat
69fb22a9e7 Avoid extra slashes in redirects with :splat (#308)
Remove leading slashes from captured portions of paths when
redirecting using splats. This makes a directive like
"/articles/*  /posts/:splat  302" behave as described in
FEATURES.md, i.e. "/articles/foo" now redirects to
"/posts/foo" rather than to "/posts//foo". Fixes #269.

This also changes the behavior of a redirect like
"/articles/*  /posts:splat  302". "/articles/foo" will now
redirect to "/postsfoo" rather than to "/posts/foo".

This change also fixes an issue where paths like
"/articles123" would be incorrectly matched by the above
patterns.

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/308
Reviewed-by: crapStone <codeberg@crapstone.dev>
Co-authored-by: Daniel Erat <dan@erat.org>
Co-committed-by: Daniel Erat <dan@erat.org>
2024-04-20 11:00:15 +00:00
Daniel Erat
9ffdc9d4f9 Refactor redirect code and add tests (#304)
Move repetitive code from Options.matchRedirects into a new
Redirect.rewriteURL method and add a new test file.

No functional changes are intended; this is in preparation
for a later change to address #269.

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/304
Reviewed-by: crapStone <codeberg@crapstone.dev>
Co-authored-by: Daniel Erat <dan@erat.org>
Co-committed-by: Daniel Erat <dan@erat.org>
2024-04-18 21:03:16 +00:00
Hoernschen
a6e9510c07 FIX blank internal pages (#164) (#292)
Hello 👋

since it affected my deployment of the pages server I started to look into the problem of the blank pages and think I found a solution for it:

1. There is no check if the file response is empty, neither in cache retrieval nor in writing of a cache. Also the provided method for checking for empty responses had a bug.
2. I identified the redirect response to be the issue here. There is a cache write with the full cache key (e. g. rawContent/user/repo|branch|route/index.html) happening in the handling of the redirect response. But the written body here is empty. In the triggered request from the redirect response the server then finds a cache item to the key and serves the empty body. A quick fix is the check for empty file responses mentioned in 1.
3. The decision to redirect the user comes quite far down in the upstream function. Before that happens a lot of stuff that may not be important since after the redirect response comes a new request anyway. Also, I suspect that this causes the caching problem because there is a request to the forge server and its error handling with some recursions happening before. I propose to move two of the redirects before "Preparing"
4. The recursion in the upstream function makes it difficult to understand what is actually happening. I added some more logging to have an easier time with that.
5. I changed the default behaviour to append a trailing slash to the path to true. In my tested scenarios it happened anyway. This way there is no recursion happening before the redirect.

I am not developing in go frequently and rarely contribute to open source -> so feedback of all kind is appreciated

closes #164

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/292
Reviewed-by: 6543 <6543@obermui.de>
Reviewed-by: crapStone <codeberg@crapstone.dev>
Co-authored-by: Hoernschen <julian.hoernschemeyer@mailbox.org>
Co-committed-by: Hoernschen <julian.hoernschemeyer@mailbox.org>
2024-02-26 22:21:42 +00:00
crapStone
7e80ade24b Add config file and rework cli parsing and passing of config values (#263)
Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/263
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: crapStone <me@crapstone.dev>
Co-committed-by: crapStone <me@crapstone.dev>
2024-02-15 16:08:29 +00:00
crapStone
c1fbe861fe rename gitea to forge in html error messages (#287)
closes #286

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/287
Reviewed-by: Andreas Shimokawa <ashimokawa@noreply.codeberg.org>
Co-authored-by: crapStone <crapstone01@gmail.com>
Co-committed-by: crapStone <crapstone01@gmail.com>
2024-02-11 12:43:25 +00:00
crapStone
cbb2ce6d07 add go templating engine for error page and make errors more clear (#260)
ping #199
closes #213

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/260
Co-authored-by: crapStone <crapstone01@gmail.com>
Co-committed-by: crapStone <crapstone01@gmail.com>
2023-11-16 17:11:35 +00:00
video-prize-ranch
974229681f Initial redirects implementation (#148)
Adds basic support for `_redirects` files. It supports a subset of what IPFS supports: https://docs.ipfs.tech/how-to/websites-on-ipfs/redirects-and-custom-404s/

Example:
```
/redirect https://example.com/ 301
/another-redirect /page 301
/302  https://example.com/  302
/app/* /index.html 200
/articles/* /posts/:splat 301
```
301 redirect: https://video-prize-ranch.localhost.mock.directory:4430/redirect
SPA rewrite: https://video-prize-ranch.localhost.mock.directory:4430/app/path/path
Catch-all with splat: https://video-prize-ranch.localhost.mock.directory:4430/articles/path/path

Closes #46

Co-authored-by: video-prize-ranch <cb.8a3w5@simplelogin.co>
Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/148
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: video-prize-ranch <video-prize-ranch@noreply.codeberg.org>
Co-committed-by: video-prize-ranch <video-prize-ranch@noreply.codeberg.org>
2023-03-30 21:36:31 +00:00
crystal
46316f9e2f Fix raw domain for branches with custom domains and index.html (#159)
fix #156
fix #157

Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/159
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: crystal <crystal@noreply.codeberg.org>
Co-committed-by: crystal <crystal@noreply.codeberg.org>
2023-02-11 03:12:42 +00:00
6543
272c7ca76f Fix xorm regressions by handle wildcard certs correctly (#177)
close #176

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/177
2023-02-11 01:26:21 +00:00
6543
d8d119b0b3 Fix Cache Bug (#178)
error io.EOF is gracefully end of file read.

so we don't need to cancel cache saving

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/178
2023-02-11 00:31:56 +00:00
6543
7b35a192bf Add cert store option based on sqlite3, mysql & postgres (#173)
Deprecate **pogreb**!

close #169

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/173
2023-02-10 03:00:14 +00:00
Gusted
8b1f497bc4 Allow to use certificate even if domain validation fails (#160)
- Currently if the canonical domain validations fails(either for
legitimate reasons or for bug reasons like the request to Gitea/Forgejo
failing) it will use main domain certificate, which in the case for
custom domains will warrant a security error as the certificate isn't
issued to the custom domain.
- This patch handles this situation more gracefully and instead only
disallow obtaining a certificate if the domain validation fails, so in
the case that a certificate still exists it can still be used even if
the canonical domain validation fails. There's a small side effect,
legitimate users that remove domains from `.domain` will still be able
to use the removed domain(as long as the DNS records exists) as long as
the certificate currently hold by pages-server isn't expired.
- Given the increased usage in custom domains that are resulting in
errors, I think it ways more than the side effect.
- In order to future-proof against future slowdowns of instances, add a retry mechanism to the domain validation function, such that it's more likely to succeed even if the instance is not responding.
- Refactor the code a bit and add some comments.

Co-authored-by: Gusted <postmaster@gusted.xyz>
Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/160
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: Gusted <gusted@noreply.codeberg.org>
Co-committed-by: Gusted <gusted@noreply.codeberg.org>
2023-02-10 01:38:15 +00:00
Gusted
513e79832a Use correct log level for CheckCanonicalDomain (#162)
- Currently any error generated by requesting the `.domains` file of a repository would be logged under the info log level, which isn't the correct log level when we exclude the not found error.
- Use warn log level if the error isn't the not found error.

Co-authored-by: Gusted <postmaster@gusted.xyz>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/162
Reviewed-by: Otto <otto@codeberg.org>
2023-01-22 18:52:21 +00:00
Gusted
f2f943c0d8 Remove unnecessary conversion (#139)
- Remove unnecessary type conversion.
- Enforce via CI

Co-authored-by: Gusted <williamzijl7@hotmail.com>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/139
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: Gusted <gusted@noreply.codeberg.org>
Co-committed-by: Gusted <gusted@noreply.codeberg.org>
2022-11-15 16:15:11 +01:00
6543
6c63b66ce4 Refactor split long functions (#135)
we have big functions that handle all stuff ... we should split this into smaler chuncks so we could test them seperate and make clear cuts in what happens where

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/135
2022-11-12 20:43:44 +01:00
6543
b9966487f6 switch to std http implementation instead of fasthttp (#106)
close #100
close #109
close #113
close #28
close #63

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/106
2022-11-12 20:37:20 +01:00
6543
91b54bef29
add newline 2022-11-07 23:09:41 +01:00
Gusted
091e6c8ed9 Add explicit logging in GetBranchTimestamp (#130)
- Logs are currently indicating that it's returning `nil` in valid
scenarios, therefor this patch adds extra logging in this code to
better understand what it is doing in this function.

Co-authored-by: Gusted <williamzijl7@hotmail.com>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/130
Reviewed-by: 6543 <6543@obermui.de>
Co-authored-by: Gusted <gusted@noreply.codeberg.org>
Co-committed-by: Gusted <gusted@noreply.codeberg.org>
2022-09-18 16:13:27 +02:00
6543
dc41a4caf4 Add Support to Follow Symlinks and LFS (#114)
close #79
close #80
close #91

Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/114
2022-08-12 06:40:12 +02:00
Gusted
876a53d9a2 Improve logging (#116)
- Actually log useful information at their respective log level.
- Add logs in hot-paths to be able to deep-dive and debug specific requests (see server/handler.go)
- Add more information to existing fields(e.g. the host that the user is visiting, this was noted by @fnetX).

Co-authored-by: Gusted <williamzijl7@hotmail.com>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/116
Reviewed-by: 6543 <6543@noreply.codeberg.org>
Co-authored-by: Gusted <gusted@noreply.codeberg.org>
Co-committed-by: Gusted <gusted@noreply.codeberg.org>
2022-08-12 05:06:26 +02:00
6543
8207586a48
just fix bcaceda711 2022-07-15 21:39:42 +02:00
6543
bcaceda711 dont cache if ContentLength greater fileCacheSizeLimit (#108)
Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/108
Reviewed-by: Otto <otto@codeberg.org>
2022-07-15 21:21:26 +02:00
6543
5411c96ef3 Tell fasthttp to not set "Content-Length: 0" on non cached content (#107)
fix #97

Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/107
2022-07-15 21:06:05 +02:00
6543
4c6164ef05 Propagate ETag from gitea (#93)
close #15

Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/93
2022-06-14 18:23:34 +02:00
crystal
38fb28f84f implement custom 404 pages (#81)
solves #56.

- The expected filename is `404.html`, like GitHub Pages
- Each repo/branch can have one `404.html` file at it's root
- If a repo does not have a `pages` branch, the 404.html file from the `pages` repository is used
- You get status code 404 (unless you request /404.html which returns 200)
- The error page is cached

---
close #56

Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/81
Reviewed-by: 6543 <6543@noreply.codeberg.org>
Co-authored-by: crystal <crystal@noreply.codeberg.org>
Co-committed-by: crystal <crystal@noreply.codeberg.org>
2022-06-12 03:50:00 +02:00
6543
02bd942b04 Move gitea api calls in own "client" package (#78)
continue #75
close #16
- fix regression (from #34) _thanks to @crystal_
- create own gitea client package
- more logging
- add mock impl of CertDB

Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: crystal <crystal@noreply.codeberg.org>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/78
Reviewed-by: crapStone <crapstone@noreply.codeberg.org>
2022-06-11 23:02:06 +02:00
6543
6f12f2a8e4
fix bug 2022-05-15 22:36:12 +02:00
6543
4267d54a63 refactor (2) (#34)
move forward with refactoring:
 - initial implementation of a smal "gitea client for fasthttp"
 - move constant into const.go

Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/34
Reviewed-by: Otto Richter <otto@codeberg.org>
2022-04-20 23:42:01 +02:00
6543
f5d0dc7447 Add pipeline (#65)
close #54

Co-authored-by: 6543 <6543@obermui.de>
Reviewed-on: https://codeberg.org/Codeberg/pages-server/pulls/65
Reviewed-by: Andreas Shimokawa <ashimokawa@noreply.codeberg.org>
2022-03-27 21:54:06 +02:00
6543
6af6523a0f
code format 2021-12-09 20:16:43 +01:00
6543
5aae7c882f
Merge branch 'master' into refactoring 2021-12-05 22:50:46 +01:00
6543
a7bb3448a4
move more args of Upstream() to upstream Options 2021-12-05 19:53:23 +01:00
6543
e85f21ed2e
some renames 2021-12-05 18:20:38 +01:00
6543
77e39b2213
unexport if posible 2021-12-05 16:24:26 +01:00
6543
e6198e4ddd
start refactor Upstream func 2021-12-05 15:59:43 +01:00
6543
de4706bf58
rm 2rm 2021-12-05 15:53:46 +01:00
6543
ccada3e6df
split cert func to related packages 2021-12-05 15:21:05 +01:00
6543
38426c26db
move upstream into own package 2021-12-05 14:48:52 +01:00