Buck Extension Language (BXL)

Buck2 will allow more complex introspection and interaction with its graphs via the bxl feature. BXL will be a starlark script that allows integrators to interact with buck commands like build and query within starlark, creating a sequence of operations that introspect, build, and extend the build graph.

These are essentially custom buck operations, defined in Starlark, that still follow the constraints of Buck2, which will enable the same level of incrementality and caching as native buck2 operations. Furthermore, bxl will have subscriptions enabled in the future, where based on the incrementality tracking, buck2 can provide "updated" bxl executions when its known that its dependencies change, and even when generated sources need to be regenerated.

The following proposes a basic set of bxl api and building blocks that are targeted at solving key issues for IDE integration.

Use Cases

Cpp LSP

Lsp prefers to have a single buck command that given a file, returns the corresponding compilation database. This requires a single command, i.e a bxl, that accepts a file as input, performs owners queries, and uses the owning target plus the desired file to get the clang flags, and then writes it to disk in comp db format. It’s possible to write the same features using buck calls to cquery, and build using subtargets to generate compilation database per file. However, this requires lsp owners to maintain code in several locations and languages, and parse and reserialize data. It also does not provide the same incrementality and subscription update features of the resulting comp db that writing this in bxl would have. Furthermore, we may explore the idea of trimming the compilation command to only dependencies required per the file requested. Bxl actions provides a straightforward api for adding this when writing the actual comp db file.

Android LSP

Android project requires traversing the target graph to find and java libraries, grouping and converting them between modules or project libraries depending on the number of references, and restructuring the graph as directory based. Android LSP is able to take advantage of subscriptions in the future when available, allowing developers to keep their IDE up-to-date automatically without needing to manually regenerate the project.

With bxl, the graph traversals can be written in starlark, allowing propagation of information down the graph, accessing targets’ attributes to analyze dependencies, and access providers for artifacts and action information needed to output the project file. Project generation also performs directory listings that buck2’s dice already performs and caches (I think, need to confirm). Bxl poses the interesting possibility that we can expose a limited set of IO operations that are tracked by dice so bxl can access the same cached file operations as rest of buck2. Android project generation currently doesn’t write project files to buck-out, which prevents it from using buck2 actions. It will have to rely on an external script to process the graph information printed by buck and write the actual project files. If it moves to buck-out based, then it can take advantage of creating actions directly using the graph information processed, and potentially take advantage of incremental actions api to avoid writing the entire graph on each subsequent update.

iOS Project

iOS is currently being implemented as a series of queries that are aggregated by an external python script, that then invokes builds of subtargets. The same can be achieved in bxl, but with the entire sequence being cacheable and subscribable so that when the graph is updated, or even when generated files need updating, buck2 can automatically push the updates. However, it is uncertain whether xcode itself can make use of push updates.

Rust LSP

(note from dbarsky@: I’m adding this at Bob’s request. Can be removed as needed.)

Visual Studio Project (vsgo)

Vsgo is a pile of python that converts buck query/buck targets output via a variety of heuristics into inputs to a custom fork gyp which is then invoked to generate visual studio projects for a given buck target. Having direct access to the internals of buck would allow us to remove the heuristics and possibly even move project generation directly into bxl.

Goals

From the above use cases, BXL should offer a simple Starlark API that allows easy introspection of the buck2 graph at unconfigured, configured, providers, and actions stage, maintaining incremental behaviour of the BXL evaluation itself.

Some minimal API should be offered to allow BXL to provide additional behaviour such as output artifacts, and print results.

Most use cases from LSP desire to also propagate information via the command line for these operations, so BXL should support command line arguments as inputs.

API

Defining a bxl function

There are multiple models possible. We can have each file be its own bxl, or have each file declare multiple bxl like rules.

There are multiple advantages to allowing declaration of multiple bxls, such as grouping similar bxls in the same file, allowing them to "invoke" each other. It doesn’t necessarily add much more complexity for the author, as even with one bxl per file, the author still has to have some declaration for the bxls arguments.

# sample.bxl
func1 = bxl_main(
   impl = my_func1,
   args = {
     "arg1": arg.list(arg.str()),
   }
)

func2 = bxl_main(
   ...
)

To invoke buck2 for that bxl, we can have the command line as follows.

buck2 bxl sample.bxl::func1 -- --arg1 foo bar baz

For bxl functions to read the arguments, a similar api to rule attrs is used

args = ctx.args.args_for_bxl

Args defined like attrs when declaring the bxl function above

Accessing target nodes

All standard query functions will be enabled in bxl, allowing users to run query operations, storing them in variables and interacting with them. These allow introspection of the unconfigured targets, or the configured targets based on api

# some.bxl
targets = ctx.uquery(‘deps("//foo")’)
targets = filter(targets, my_filter)

# introspect a target
for target in targets:
  ctx.print(target.attributes) # prints selects
  # also inspect the target like below
  ctx.print(target.label)

target = ctx.cquery("//foo", "//x86").attributes # cquery has selects resolved

Inspect providers

When we have a configured target, bxl can request for the analysis of the rule

target = <some configured target>

ctx.analysis(target).providers # access the providers

Actions

For IDEs, to generate compilation databases, or generate project files, writing them in bxl will entail creating actions, and executing them. As such, bxl will also be given the rules api to register actions, including dynamic outputs for the rule in the current bxl invocation to build artifacts as part of a bxl function.

BXL has the ability to create actions with some constraints:

Action is tied to a particular target
It’s output location is determined in the same pattern as regular actions defined via rules

targets = ctx.cquery(‘deps("//foo:rule")’)

for t in targets:
  action_ctx = ctx.analysis(t).actions
  # the action context here is tied to the configured target `t`
  # actions registered by bxl will be attached with bxl prefix key
  action_ctx.registry.write(some_output, "foo")

BXL can also interact with the existing actions on an action via the action_ctx, such as iterating through it, analyzing its outputs, or requesting it to be ran.

targets = deps("foo:rule")

for t in targets:
  action_ctx = ctx.analysis(t).actions
  for action in action_ctx.iter():
    if "foo/path" in action.output:
      ctx.build(action)

What is cached?

All computations requested by a bxl function will be treated as inputs. So if a bxl function calls uquery, then uses the result to do a cquery, and then a build, if buck2 detects that any of the recorded calls to uquery, cquery, and build changes, the entire bxl will be reran, with no early cutoff. The computations itself will still be cached via DICE, so no major performance issues are expected. However, in the event that a bxl function is computationally heavy, the recommendation would be to move that to an action, or split up the bxl and use inter-bxl caching described below.

Inter-bxl caching?

Different bxl can be cacheable between each other if structured as "outputs"/artifacts. This is essentially the same behaviour as a bxl requesting ctx.build, which is cached. Since we have those as hashes on RE, we can track properly and not require storing the values in dice.

i.e.

# caching_sample.bxl
func1 = bxl_main(
   impl = my_func1,
   args = {
     "arg1": arg.list(arg.str()),
   }
)

my_func1(ctx):
  …
  # do various stuff that might change a lot, but the final result
  # doesn’t change much
  ctx.return(some_artifact)

func2 = bxl_main(
  impl = my_func2,
  ...
)

my_func2(ctx):
  artifact = ctx.bxl(":func1")
  # now read artifact value
  # everything below will only be reran if the artifact content changes
  …
  # do some expensive stuff

Buck Extension Language (BXL)

Use Cases​

Cpp LSP​

Android LSP​

iOS Project​

Rust LSP​

Visual Studio Project (vsgo)​

Goals​

API​

Defining a bxl function​

Accessing target nodes​

Inspect providers​

Actions​

What is cached?​

Inter-bxl caching?​