README.md gives unopinionated instructions on how to install and use this repository, which are appropriate for general users. This file contains instructions and details more appropriate for people who need to develop and improve the Katana system itself.
The quickest way to start hacking on Katana is to look at `scripts/setup_dev_ubuntu.sh` and use that as the basis for installing a development environment on your own machine.
With those developer dependencies installed, you can run the following commands from the project root directory to build the system:
```shell
case $(uname) in
Darwin*)
  ;;
*)
  conan profile update settings.compiler.libcxx=libstdc++11 default
  ;;
esac

mkdir build
cd build
conan install ../config --build=missing
cmake -DCMAKE_TOOLCHAIN_FILE=conan_paths.cmake ..
make
```
Note: Katana builds with a variety of compilers as long as they support the C++17 standard. See README.md for specific tested versions.
This repository began life as a fork of the Galois project.
As part of the commercialization process, this repository was combined with proprietary parts of the Katana system and later disentangled.
Disentangling the open-source and proprietary components required rewriting the git history of this repository.
Because of this, commits from 2020-06-30 until 2020-10-29 (marked with `[REWRITTEN COMMIT: incomplete subtree]`) are unbuildable and inconsistent; they are included to provide context only. Commits from 2020-10-29 until 2021-01-01 (marked with `[REWRITTEN COMMIT: self-contained subtree]`) are consistent, but changes mentioned in their commit messages may have been removed.
The above instructions should work if you have installed the C++ library dependencies in `scripts/setup_dev_ubuntu.sh` (e.g., llvm-dev, arrow) in their standard system locations (typically `/usr/lib` or `/usr/local/lib`). If you need to tell `cmake` about additional library locations, you can use the CMake option `CMAKE_PREFIX_PATH`, as in:
```shell
cmake -DCMAKE_TOOLCHAIN_FILE=conan_paths.cmake \
  -DCMAKE_PREFIX_PATH="<path/to/cmakefiles/for/library>;<another/path>" ..
```
As a side note, a CMake toolchain file is simply a method for initially defining `CMAKE_PREFIX_PATH` and other CMake options. You can verify this by looking at the contents of `conan_paths.cmake`.
A common issue is that you have multiple versions of the same dependency, located in different directories, and CMake picks the wrong version.
The process by which CMake finds packages is involved, and the CMake documentation contains all the gory details. One implication, though, is that CMake adds directories in your path to its set of search locations.
Thus, if the LLVM C++ compiler (clang++) is in your path, CMake will attempt to use the LLVM support libraries (e.g., libLLVMSupport.a, libclang.so) associated with your compiler installation by default, even though your compiler and the version of the LLVM support libraries you use are not strictly related to each other.
You can work around this by putting the location of the LLVM support libraries in `CMAKE_PREFIX_PATH` because that takes precedence over locations in your path. Alternatively, you can indicate the location of the LLVM libraries directly with `LLVM_DIR`:
```shell
cmake -DCMAKE_TOOLCHAIN_FILE=conan_paths.cmake \
  -DLLVM_DIR="$(llvm-config-X --cmakedir)" ..
```
The way to contribute code to the Katana project is to make a fork of this repository and create a pull request with your desired changes. For a general overview of this process, see this description from Atlassian.
If you are an external contributor (i.e., someone who does not have write access to this repo), the main reviewer of your pull request is responsible for finally merging your code after it has been reviewed.
If you are an internal contributor (e.g., a member of Katana Graph), you are responsible for merging your code after it has been reviewed.
When writing your change and during the review process, it is common to just append commits to your pull request branch, but prior to merging, these process commits should be reorganized with the interest of future developers in mind. This usually means structuring commits to be less about your process and more about the logical outcomes.
A common pattern is a refactoring commit followed by the actual new feature. Each commit is usually buildable and testable on its own but altogether they comprise the work done. You can look at the git history of this repo for inspiration, but if you have any doubts, squashing all your commits into one is usually a safe choice.
You can use `git rebase --interactive` to reorganize your commits and `git push --force` to update your branch. Depending on your IDE, there are various features or plugins to streamline this process (e.g., vim-fugitive, magit). Git also provides `git commit --fixup` and `git rebase --autosquash`, which serve the same purpose.
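For example, a minimal fixup workflow might look like this (the file path and commit hash are illustrative):

```shell
# stage a change that repairs an earlier commit on the PR branch
git add lib/somefile.cpp        # hypothetical path
git commit --fixup abc1234      # abc1234 is the commit being repaired

# fold all fixup commits into the commits they repair
git rebase --interactive --autosquash master

# update the pull request branch with the rewritten history
git push --force
```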
Release versions of Katana are semantic versions with a release candidate tag (`rc#`) appended for release candidates.

Build/development versions of Katana are of the form `[release].dev+N.0.######`, where:

- `[release]` is a release version
- `N` is the number of commits on the nearest major branch (e.g., `master` or `release/v*`)
- `######` is an abbreviated commit id

The `0` is a placeholder for the enterprise commit count in enterprise builds (enterprise builds also add an enterprise commit hash). The commit hashes are replaced with `dirty` if the tree is dirty. For example: `0.1.10rc1.dev+40.0.04ac6f`.
The version number of a commit is computed from the content of the repository. The process is:
- If the commit has a tag in the form `v{version number}` (e.g., `v0.1.10rc1`), then that is the version, regardless of other information.
- Otherwise, the content of `config/version.txt` specifies the upcoming release, and the version of the commit is computed based on the commit history and branches.
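For instance, a quick way to see which rule applies to a commit (the tag and version values shown are illustrative):

```shell
# if the commit carries a version tag, that tag wins:
git tag --points-at HEAD   # e.g., prints v0.1.10rc1 -> version is 0.1.10rc1

# otherwise config/version.txt gives the upcoming release:
cat config/version.txt     # e.g., prints 0.1.11 -> version is 0.1.11.dev+N.0.######
```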
The main development branch `master` should build and pass tests, but may be unstable at times. Regressions may be introduced post-merge, or a PR may be more clearly expressed with intermediate commits that do not pass all tests. We prefer a clear history over "atomic" commits. During our release process, we specifically address stability and regression issues.
The tool `scripts/version` performs a number of tasks related to version management. Day-to-day uses are mentioned here, and a full list of commands and options is available from `scripts/version --help` (and `--help` on each subcommand).
To show the current version, run `scripts/version show`. This will print out the version of the working tree. To see the version of a specific commit, run `scripts/version show {commit}`. This will print out the version of a clean checkout of that commit.
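For example (`HEAD~3` is just an illustration; any commit reference works):

```shell
scripts/version show          # version of the current working tree
scripts/version show HEAD~3   # version of a clean checkout of that commit
```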
To create a release branch, run `scripts/version release_branch {next version}` and merge the PRs it creates as soon as possible (to limit the number of commits assigned incorrectly ordered version numbers). This will create a new release branch: `release/v{current version}`. The first commit on that branch suffixes the version number stored in `config/version.txt` with `rc1` (e.g., `0.1.11rc1`). The commit on master immediately after the branch changes the version number to `{next version}`. The release branch should never be rebased, as it represents a permanent history of the release process. We can stabilize the code as needed on the release branch before releasing the first release candidate (e.g., `0.1.11rc1`).
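Using the version numbers from the walkthrough below, the branching step looks like:

```shell
scripts/version release_branch 0.1.12
```

This creates the branch release/v0.1.11 (with version 0.1.11rc1) and moves master on to 0.1.12.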
To create a release, run `scripts/version release {next version}` and merge the PRs it creates as soon as possible. This tags the current `HEAD` commit with `v{current version}` (e.g., `v0.1.10rc1`) and creates commits to increase the version number to `{next version}`. The release tags should be annotated tags to store who created the tag and the date. The same commands are used regardless of the kind of release: final release (semantic version only), release candidate (`rc`), or post-release (`.post`).
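For example, from the walkthrough below, releasing the first release candidate on the release branch looks like:

```shell
scripts/version release 0.1.11rc2
```

This tags the current commit as v0.1.11rc1 and bumps the version number to 0.1.11rc2.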
To tag an existing version as a release (e.g., to tag a release candidate as the final release), run `scripts/version tag {version}`. This tags the current `HEAD` commit with `v{version}` (e.g., `v0.1.10`) to mark it as the `{version}` release.
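For example, to promote the commit already tagged v0.1.11rc1 to the final release:

```shell
scripts/version tag 0.1.11
```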
To begin work on a post-release (a maintenance release for a previous final release), switch to the release branch (`release/v{version}`), run `scripts/version bump {version}.post1`, and merge the PRs it creates. This will update the version number to `{version}.post1` in preparation for work on a post-release. After development on the first post-release is complete, create and tag post-releases just like any other release.
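For example, to start maintenance on the 0.1.11 release (version numbers are illustrative):

```shell
git checkout release/v0.1.11
scripts/version bump 0.1.11.post1
```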
Post-release development is entirely on the original release branch. Post-releases should be avoided if possible. Master can be merged into the release branch if needed, or commits from master can be cherry-picked into the release. Changes should not be developed on the release branch. The exception is maintenance on code that has been removed permanently from master, but this should be avoided whenever possible.
Let us walk through a development example showing the steps and commands used. The following tables have alternating rows: the state of the repositories with the computed version, and the actions taken by the development team (in italics). Commit hashes are replaced with `#` for brevity.
The steps leading up to branching for the 0.1.11 release:
| Computed version | Repository status |
| --- | --- |
| *Developer steps* | |
| 0.1.11.dev+1.1.#.# | version.txt = 0.1.11 |
| *10 commits in katana and 20 commits in katana-enterprise.* | |
| 0.1.11.dev+11.21.#.# | version.txt = 0.1.11 |
| *Run `scripts/version release_branch 0.1.12` and merge the created PRs on both master and the release branch release/v0.1.11.* | |
The steps on the release branch release/v0.1.11 to release 0.1.11:
| Computed version | Repository status |
| --- | --- |
| 0.1.11rc1.dev+1.1.#.# | version.txt = 0.1.11rc1 |
| *1 commit in katana and 1 commit in katana-enterprise.* | |
| 0.1.11rc1.dev+2.2.#.# | version.txt = 0.1.11rc1 |
| *Run `scripts/version release 0.1.11rc2` and merge the created PR on the release branch release/v0.1.11. Once this commit is tagged v0.1.11rc1, it is also version 0.1.11rc1 and can be released.* | |
| *Once 0.1.11rc1 passes QC, run `scripts/version tag 0.1.11` to tag the same commit as the final release 0.1.11. Once this commit is tagged v0.1.11, it is also version 0.1.11 and can be released. At this point the commit will have 3 versions: 0.1.11rc1.dev+2.2.#.#, 0.1.11rc1, and 0.1.11.* | |
The state of master after branching for 0.1.11:
| Computed version | Repository status |
| --- | --- |
| 0.1.12.dev+1.1.#.# | version.txt = 0.1.12 |
| *Continue development...* | |
```shell
make -C <some subdirectory in the build tree>
```

This builds only the targets defined in that subdirectory (and their dependencies). This is useful for compiling a single application or library at a time.

```shell
make -j 4
```

This launches 4 jobs to build the system concurrently.

```shell
make VERBOSE=1
```

This prints out the commands that are run. This is useful for debugging installation issues.

```shell
ctest -L quick
```

This runs code tests locally on your machine. Most of the tests require sample graphs. You can get them with `make input`.

```shell
ctest -R regex --parallel 4
```

Run tests matching the regular expression, using 4 parallel jobs to execute tests.

```shell
ctest --test-action memcheck
```

Run tests with valgrind memcheck.

```shell
ctest --rerun-failed
```

Rerun just the tests that failed during the last run.
```shell
bin/graph-properties-convert <infile> <out directory/s3>
```

This converts property graphs into Katana form.
- `ccmake`: a simple terminal UI for configuring CMake projects. This is helpful for setting various build options without having to remember their names exactly.
- `ccache`: a wrapper around compilation commands that reuses object files even if their timestamps change, as long as the contents of the source file and compiler flags are the same. To enable it, configure with `cmake -DCMAKE_CXX_COMPILER_LAUNCHER=ccache`.
The LLVM project provides many useful development tools for C++: clang-format, clang-tidy, and clangd. Most IDEs can be configured to use these tools automatically.
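For example, both tools can also be run by hand (the file path is hypothetical, and `clang-tidy -p` assumes the build was configured with `CMAKE_EXPORT_COMPILE_COMMANDS=ON`):

```shell
clang-format -i libsupport/src/SomeFile.cpp      # reformat a file in place
clang-tidy -p build libsupport/src/SomeFile.cpp  # lint using build/compile_commands.json
```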
Adding new dependencies should generally be avoided since it makes it more likely that satisfying local development requirements, conda build requirements, production library requirements, etc. will become impossible. If you do choose to require a new third-party library for a good reason, you should:
- Choose a version of the library that is available both in conda-forge and in ConanCenter. If it is not available in both places, Ubuntu package managers like `apt` or `snap` can work, but adding it will be different (and you should consider picking another library, since this puts an extra burden on developers).
- Add the dependency to the conan config in the style of the dependencies that are already there.
- Add the dependency to the conda recipe in the style of what's there. There are two sections: `host` and `run`. Any runtime deps need to be added to both sections, but deps which are totally compiled into Galois (i.e., they are not exposed in our API and don't require a shared library at run time) can be in `host` only.
- It is possible that you may have to modify the package config as well so `cmake` will find your dependency during the conda build (again, the best advice is to look at how other dependencies handle this). This should only be necessary if the new dependency is a runtime or user-code dependency. For instance, this should not be necessary for header-only libraries that are not used in public headers.
If you do end up choosing a library that is not in conda-forge/ConanCenter (really?), make sure to update the dependency list in README.md, and make sure the script for setting up a dev environment is updated as well. Changes to the CI scripts will likely be needed too.
You should be particularly wary of libraries that are not in conda-forge. If absolutely necessary, discuss it with the current conda package maintainer (currently @arthurp). Handling such libraries incorrectly will totally break the conda packages.
Printing and its more production-oriented cousin, logging, are simple ways to get started with debugging, especially if you are in an environment where you can build executables from source. Just remember to prefix your debugging messages with an easy-to-find string like `XXX` so you can find and remove them later.
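Since `XXX` is easy to search for, a quick check before committing might look like:

```shell
grep -rn XXX lib*   # find leftover debugging messages
```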
For more interactive debugging, you can use `gdb`. A typical `gdb` session looks like this:
```shell
gdb --args application arg1 arg2 arg3
> break SourceFile.cpp:LineNumber
> run
> next
> print
# edit some code
> make
> run
```
If you are debugging an MPI application, you can use a command like `mpirun -np 4 xterm -e gdb application` to spawn a `gdb` session for each MPI host, or use tmpi, which will spawn `gdb` sessions in `tmux` panes instead of `xterm` windows. These commands work best if all the MPI processes are running on the same machine. If not, you will have to work out how to open connections to each worker machine. The OpenMPI project gives some pointers, but in practice it is usually easier to fall back to print-statement debugging or to try to reproduce your issue on a single host if possible.
An alternative to running a debugger is to load a core dump. Most machines disable core dumps by default, but you can enable them with:
```shell
ulimit -c unlimited
sudo sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t
```
And you can load them in `gdb`:
```shell
gdb application -c core-file
```
Many tests require sample graphs that are too big to keep in this repository. You can get them with `make input`.

If you need to update the inputs, they are referenced as `https://katana-ci-public.s3.us-east-1.amazonaws.com/inputs/katana-inputs-vN.tar.gz` in `.github/workflows`, `inputs/CMakeLists.txt` and `external/katana/python/katana/example_utils.py`. `vN` is a monotonically increasing version number. You can use the command `inputs/update_inputs.sh` to create a new input collection. After creating the tar file, you will need to upload the file to the public S3 bucket.
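The upload can be done with any S3 client; for example, with the AWS CLI (the bucket path is inferred from the URL above, and your credentials need write access to the bucket):

```shell
aws s3 cp katana-inputs-vN.tar.gz s3://katana-ci-public/inputs/
```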
Tests are just executables created by the CMake `add_test` command. A test succeeds if it can be run and it returns a zero exit value; otherwise, the test execution is considered a failure. A test executable itself may contain many individual assertions and subtests.

Tests are divided into groups by their label property. A test can be given a label with the `set_property`/`set_tests_properties` command, e.g.,

```cmake
set_property(TEST my-test PROPERTY LABEL my-label)
```
So far there is only one useful label:
- `quick`: Quick tests have no external dependencies and can be run in parallel with other quick tests. Each quick test should run in a second or less. These tests are run as part of our continuous integration pipeline.
We automatically check adherence to some code style rules at merge time, so the automatic tools have the final word. Nevertheless, there are some conventions enforced by maintainers to be aware of:
- `TODO`/`FIXME`/`XXX` are common comments used to denote known issues that should be fixed soon. In this code base, `XXX` is reserved for local changes that the author intends to fix immediately, i.e., comments with `XXX` should never be committed. `TODO` and `FIXME` should only be used with the name of a contributor who is responsible for fixing them (usually you) in the style `TODO(thunt)`. In general, `TODO` tasks involve more work to fix than `FIXME` tasks, but the distinction isn't enforced.
- Dates in comments should always take the form `YYYY-MM-DD`.
- In general, we try to adhere to the Google C++ Style Guide. One exception (so far) is that when passing arguments to functions, prefer pointers to non-const references (which used to be a part of that guide).
The following are the automated style checkers that we use. The continuous integration process will check adherence on each Pull Request, but it is faster, in terms of feedback latency, to run these checks locally first.
- `scripts/check_docs.sh`: ensures that you didn't break doc generation.
- `scripts/check_ifndef.py [-fix] lib*`: checks that header guards are well formed.
- `scripts/check_cpp_format.sh [-fix] lib*`: applies `clang-format` to check style.
- `scripts/check_go_format.sh [-fix] .`: applies `gofmt` to format Go code.
- `scripts/check_go_lint.sh .`: applies `golangci-lint` to check style.
- `scripts/check_python_format.sh .`: applies `black` to format Python code.
We also have `clang-tidy` and `pylint` configurations. Both are useful tools for checking your code locally, but they are not currently run by continuous integration.
None of these checks are exhaustive (on their own or combined)!
New code is expected to follow the above guidelines. But also be aware: there are older modules that predate these expectations (currently, older modules are everything that is not `libtsuba`, `libsupport` and `libquery`). In those cases, it is acceptable to follow the convention in that module; though in general, it would be preferable to follow the motto of "leave the codebase in better shape than you found it."
Code that needs to indicate an error to callers typically should use `katana::Result`. When a function returns `katana::Result<T>`, it means that the function either returns a `T` or an error. When a function returns `katana::Result<void>`, it means that the function either succeeds (i.e., returns `void`) or returns an error.
A common pattern for using a function that returns a `katana::Result<T>` is:
```c++
auto r = ReturnsAResult();
if (!r) {
  // Some error happened. Either...
  // ... we handle it
  if (r.error() == ErrorCode::Foo) {
    return DoAlternative();
  }
  // ... or we propagate it
  return r.error();
}

// No error happened. Continue on.
T value = std::move(r.value());
```

See `Result.h` for more details.
If you are looking to simplify error handling and a function returns a `katana::Result<void>`, you can define the result and check it in a single `if` statement:
```c++
if (auto r = ReturnsAResult(); !r) {
  return r.error();
}
```
Code should be exception-safe, but exceptions are rarely thrown in the codebase. Exceptions are reserved for situations where it is equally acceptable to terminate the current process, which is rare in library code, or as a safe implementation of `setjmp`/`longjmp`, which is even rarer.
Errors are a part of the contract between a caller and callee in the same way parameters and return values are. When writing an error message or selecting an error code, consider the perspective of the caller and their natural first question: "how can I make this error go away?"
Good messages should be to the point and in terms of the caller and not artifacts of an implementation detail in the callee.
Compare
```c++
Result<void> CheckNumber(int number) {
  if (number == 0) {
    return KATANA_ERROR(IllegalArgument, "cannot divide by zero");
  }
  int u = n / number;  // the division is an implementation detail of the callee
  ...
}
```
and
```c++
Result<void> CheckNumber(int number) {
  if (number == 0) {
    return KATANA_ERROR(IllegalArgument, "number should be positive");
  }
  int u = n / number;
  ...
}
```
As a matter of consistency and style, messages should begin with a lowercase letter to avoid switching between different case styles when errors are propagated.
For example,
```c++
Result<void> MakeList() {
  ...
  if (auto r = CheckNumber(n); !r) {
    return r.error().WithContext("making number {}", n);
  }
}
```
will create error strings like `making number 0: number should be positive`.
On older compilers, auto conversion to `katana::Result` will fail for types that can't be copied. One symptom is compiler errors on GCC 7 but not on GCC 9. We've adopted the workaround of returning such objects like so:
```c++
Result<Thing> MakeMoveOnlyThing() {
  Thing t;
  ...
  return Thing(std::move(t));
}
```
If the error is due to a transient external failure, you can re-run jobs in the GitHub UI.
When debugging a CI failure, it is good to confirm that tests pass locally in your developer environment first. The CI runs on the merge of your PR and the branch you want to merge with (usually master), so if you have issues reproducing locally make sure your PR branch is up to date as well.
You can also run many of the source checks locally (usually `scripts/check_*`), and most of them accept a `-fix` option to automatically correct the errors they check for. Take a look at the GitHub workflow definitions under the `.github` directory to see what script and build parameters are used.
GitHub Actions allows build data to be cached between CI runs. For reference, the caches (`actions/cache`) are scoped to branches. The cache matching policy is:
1. Exact key match on the current branch.
2. Prefix match of a restore key on the current branch. If there are multiple matching keys, return the most recent entry.
3. Repeat from 1 for the default branch.
Keys should be unique because once a cache entry is created, it will never be updated by `actions/cache`.
If you need to create a cache that simply stores the latest values, create a common prefix with a unique suffix (e.g., `github.sha`) and use the common prefix as a restore key. The unique key will not match any existing key, but upon lookup there will be multiple matching cache entries sharing the common prefix, and `actions/cache` will return the most recent one.
One common use of `actions/cache` is to store a ccache cache. There is no limit on the number of caches, but once the overall size of a cache exceeds 5 GB (compressed), GitHub will start evicting old entries. 5 GB isn't particularly large for a ccache, so we currently manually limit the size of each ccache to a certain number of files (`ccache --max-files`) to more directly control cache behavior and ensure fairer eviction among GitHub caches. The downside is that these limits need to be periodically reassessed.
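As a sketch, the limit can be set and inspected like so (the file count is arbitrary, not the value used in CI):

```shell
ccache --max-files 150000   # cap the cache at a fixed number of files; 0 means no limit
ccache --show-stats         # inspect the current cache size and limits
```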