# Word Scoring project for Outreachy / GSoC

**Status:** Final | March 2025

The [Word
Score](https://jrb.pages.gitlab.gnome.org/crosswords/devel-docs/word-scores.html)
project has been proposed for 2025 GSoC and Outreachy. Multiple people
have inquired about what should be done to write the proposal.

The general recommendation for most GSoC projects is to build the
project and try to fix some bugs. However, this project is different
enough from the rest of the code base that the normal set of
[newcomer](https://gitlab.gnome.org/jrb/crosswords/-/issues/?label_name[]=Newcomers)
bugs isn't useful. Instead, we need an alternate approach. This doc is
an attempt to provide guidance and answer some questions.

## Project

Prospective interns should attempt to calculate the bigraph and
trigraph scores as listed in the [design
doc](https://jrb.pages.gitlab.gnome.org/crosswords/devel-docs/word-scores.html).

They should start by making a fork of the crosswords repo and then
work in their fork within the `word-lists/` and `tools/`
directories. Their proposal should include a link to what they did and
explain how they went about it. I'd also like to see a doc describing
their scoring approach.
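
As a starting point, here is a minimal sketch of counting letter pairs and triples across a word list. It is not the scoring method from the design doc: the function names `ngram_counts` and `log_frequencies` are hypothetical, the three-word list stands in for the real data set, and turning counts into log relative frequencies is only one assumed way to get a number out of them.

```python
import math
from collections import Counter


def ngram_counts(words, n):
    """Count every run of n consecutive letters across all words."""
    counts = Counter()
    for word in words:
        w = word.strip().upper()
        # A word of length k contributes k - n + 1 overlapping n-grams.
        for i in range(len(w) - n + 1):
            counts[w[i:i + n]] += 1
    return counts


def log_frequencies(counts):
    """Convert raw counts to log relative frequencies (an assumed metric)."""
    total = sum(counts.values())
    return {gram: math.log(c / total) for gram, c in counts.items()}


# Toy input; the real project works from a much larger corpus.
words = ["CAT", "CART", "ART"]
bigrams = ngram_counts(words, 2)    # two-letter sequences
trigrams = ngram_counts(words, 3)   # three-letter sequences
bigram_scores = log_frequencies(bigrams)
```

A proposal would replace the toy list with the real data and explain why its chosen metric matches the design doc.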

Proposals will be assessed on some of the following characteristics:

* Demonstrating independence / problem solving ability
* Knowledge of / ability to work with Git
* Quality of the proposal for the subproject
* Ability to write clean Python code
* Ability to generate data and understand what has been generated
* Plan for storage of intermediate data
* Actual score value!
* etc.

And as a bonus:

* Understanding how to pipe the data through the structure to the C
  code.

If they have relevant experience / code / contributions outside of
this project, they should include those as well.

:::{note}
I don't expect everyone to be able to check off all these
boxes in the proposal phase. Work done here will carry over directly to
the actual project, so making progress on it is a good beginning.
:::

## Background links

* [Word Score Design Doc](https://jrb.pages.gitlab.gnome.org/crosswords/devel-docs/word-scores.html)
* [Build and run instructions](https://jrb.pages.gitlab.gnome.org/crosswords/devel-docs/build-and-run.html)
* [Intro video](https://www.youtube.com/watch?v=pSr6RSY0_fM)
* [word-lists/ directory in crosswords](https://gitlab.gnome.org/jrb/crosswords/-/tree/master/word-lists?ref_type=heads)

## Requirements Reminder

Just a reminder that this project will require a reasonably stable
internet connection and a machine with sufficient storage to download
a 20 GB data set. I also want people to be able to run a Linux desktop
so that they can develop in the same environment that crosswords is
developed in.
