KREEP, apples and penguins.

Hi everyone!

I haven’t written about Operation KREEP for months now. It is time to revisit this old buddy bud bud of mine!

Yes, as the title suggests, it is cross-platform time 😉 …

[Image: Operation KREEP box art with platform logos]

Not official, but soon…

Nope, sadly no official release yet 😦 , but the Linux build is ready and tested (at least on my Nix machines) and the Mac build is ready for testing too. This means that in a week or two an official release could happen, although a little piece of the puzzle is still missing.

I require additional pylons!

I have two PCs, so I tested the Linux version of the game on two Ubuntu versions, but more coverage would be nice (there are a zillion distros 😦 ) + I HAVE NO MAC MACHINE 😦 …

This means that the Mac build has essentially never even been started! I would really love to release the cross-platform builds, and players have already asked for them, but without sufficient testing it is not going to happen. Buying a Mac would be a somewhat logical investment at this point, but Operation KREEP (and my whole game development venture for that matter) is on an extremely tight budget, as it is not profitable so far, so I will try to postpone that purchase a little.

Feedback, results, “compensation”

Based on the differences between the builds (almost zero code change, only the packaging varies), I think a few simple checks would suffice: whether the installation works (files copied, icons set etc…), whether the game starts, and whether the basic configuration settings work (they apply and are saved to the correct application data folders) + a short test play round just for fun 😉 .

I know it is shady to ask for free QA for a product, but this is the reality of the situation I’m in 😐 . If you would like to help out, I plan to share a limited number of Steam/itch.io/IndieGameStand keys for the full game as “payment”.

I put together a short form to ease reporting results: KREEP, apples and penguins

If you’d rather not share any personal information but would still like to help out, simply post your results as a comment here or contact me by e-mail: spidi@magicitemtech.com

I guess the contact info of a cheap used-Mac reseller in Hungary could help too, if you know any 🙂 .

Demo builds

[Images: Mac and Linux demo builds]

Porting tech stuff

Just a little tech talk as closing words. The Windows version of the game was made in C# using XNA. In the last few years two really cool projects were born to both preserve and enhance XNA: MonoGame and FNA. Both are great and well established/tested at this point, but I chose FNA for porting Operation KREEP to Linux and Mac. My reasoning was the following:

  • Around a year or two ago, when I was using MonoGame on my Linux machines, I encountered some difficulties. MonoGame on *nix platforms used OpenTK for window, input and OpenGL context management, and as far as I know that library had its fair share of bugs, with no real support/contributions/fixes for a long while.

    Remark: as far as I know the MonoGame team switched to SDL2 recently, the same library FNA uses under the hood, so this is probably no longer the case.

  • MonoGame favors a per-platform build approach, which, looking at all the possible target platforms (desktop, mobile, consoles), is a logical choice, but it requires managing and building a separate executable for each target. FNA approached this from the get-go with a common desktop runtime, so one build works on all major desktop platforms (only the packaging has to be taken care of per target).
    Remark: if I’m not mistaken, a “common desktop” build was introduced for MonoGame too last year, so technically it could work the same way as FNA on desktop.

  • The FNA developer, Ethan Lee, has a laser focus on cross-platform desktop XNA development and delivery, and the FNA wiki has really nice documentation both on working with FNA (differences and extras compared to XNA) and on packaging + delivering games with it for Windows, Mac OS X and Linux. This documentation seemed really helpful and complete.

All in all I suspect both libraries could work perfectly for publishing your games to the three major desktop platforms, but I wanted to give FNA a try. I was pleasantly surprised: most things worked in a snap, without much fiddling.

That is it for today; in a few days I’ll post a new video & blog entry for I am overburdened. If you decide to help: MUCH LOVE, SUCH WOW 🙂 and thanks awfully!

Take care!

Magic Item Tech, testing – part 3.

Hi all!

It is time for some software testing framework talk again 🙂 !

As you know, I’m a big quality/testing advocate (especially when it comes to software!!!), and I’ve been working for a while now, in my “lab”, on a high-level testing and automation framework, since I’ve reached the limits of the usefulness of unit testing for my game projects.

I’m going to showcase the newest addition to this testing framework, so if you completely missed it and are interested, here are the two older posts summarizing the design and some implementation details (with a showcase video 😉 ):
Testing – part 1.
Testing – part 2.
A quick recap, if you would rather not read through the usual pile of text 🙂 , but are still interested in this dev log entry:
It is a “capture and replay” based framework, where you can record (or edit/create) input events (or special ones, e.g.: network, debug events etc…) from the game to replay them later. For checking certain functionality, the replay files can be filled with assertions targeting the properties of any game-object within the game world at any given frame of the replay.

So, the system served me well while developing the Operation KREEP update. It took less than two work days to reach pretty high coverage of the game-play features and the UI work-flow (around 80% code coverage in 12 hours). This test-suite helped me a lot while releasing the Steam version: besides keeping my sanity by covering and assuring that most of the high-level features work, it saved me from introducing a few pesky bugs while coding the new features!

After a while though, execution times grew, as is to be expected. The 67 replays for the final Steam build(s), which check the UI work-flow, complete the tutorial, play until getting most of the achievements and so on and so on, require around 11 minutes to fully execute (more than a cup of hot beverage takes from inception to getting it into the belly 😀 ). I knew this was going to happen eventually, having worked with similar frameworks before, but I also already had some ideas in the back of my mind to fight it if it became a problem. Obviously it was not a big irritation yet, but for a larger game it could become a real source of frustration.

[Screenshot: full test-suite execution]

The simplest and most obvious solution was categorizing the test-cases within a suite (UI, Options, Graphics, Tutorial etc…) and only executing the immediately required categories. This took only a short time to develop and configure, as NUnit already had support for it; I just had to put some extra properties here and there.
This was nice and worked well, making it faster to test the crucial/modified parts, but of course there was a much smarter idea where this one came from (a full test run still required 10+ minutes 🙂 , no way I could not improve on that 😉 )!

    
    
        
        
    <!-- Note: element names are approximate reconstructions. -->
    <TestSuite>
        <TestCases>
            <TestCase>
                <Name>Match Pause Restart</Name>
                <Replay>Tests\ReplayMatchPauseRestart.xml</Replay>
                <Categories>
                    <Category>Match</Category>
                    <Category>Pause</Category>
                </Categories>
            </TestCase>
        </TestCases>
    </TestSuite>

[Screenshots: running only selected test categories]
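For the curious, here is a rough C# sketch of how such category metadata can be attached when generating the NUnit test cases at runtime (the SuiteCase type and its members are only my illustration here, not the framework’s actual API); with the NUnit 3 console runner a single category can then be executed with --where "cat == Match":

    using System.Collections.Generic;
    using NUnit.Framework;

    public class SuiteCase
    {
        public string Name;
        public string ReplayPath;
        public List<string> Categories = new List<string>();
    }

    public static class ReplayCaseFactory
    {
        // Turns suite definition entries into NUnit test cases with categories.
        public static IEnumerable<TestCaseData> Generate(IEnumerable<SuiteCase> cases)
        {
            foreach (var suiteCase in cases)
            {
                var data = new TestCaseData(suiteCase.ReplayPath).SetName(suiteCase.Name);
                foreach (var category in suiteCase.Categories)
                    data = data.SetCategory(category); // e.g.: "Match", "Pause"
                yield return data;
            }
        }
    }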

So I investigated two common solutions to this problem (actually there is a third, manual one: modify the test-cases to make them simpler and shorter, but that is not a general solution, it takes time and the gains are small), and I went with the “speeding up test-case execution” route. I haven’t found any good/common name for it, although it is a known solution, so I called it the “unlocked” game-loop. The concept is simple: when replaying the test-cases, a different game-loop is used, which runs as fast as it can (no vsync, no sleep, nothing like that, exercising the CPU/GPU like a madman 😀 ), but the elapsed, total and accumulated time calculated and passed to the systems and game-objects of the game mimics the normal game-loop, so the game “believes” it is running at normal speed with the target 30 or 60 frames per second. I was certain that it was going to speed up execution, at least cutting execution time in half for a simple game. I was wrong: it became much, much faster 😀 . After the new game-loop, the full test set took not much more than 2 minutes instead of 11…
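Here is a minimal C# sketch of the concept, assuming an XNA-style Update(GameTime) entry point (the real loop also has to drive rendering, exit conditions and so on; this only shows the time trickery):

    using System;
    using Microsoft.Xna.Framework;

    public sealed class UnlockedLoop
    {
        // A fixed 60 FPS step is reported to the game, no matter how fast we run.
        private static readonly TimeSpan Step =
            TimeSpan.FromTicks(TimeSpan.TicksPerSecond / 60);

        public void Run(Func<GameTime, bool> update)
        {
            var total = TimeSpan.Zero;
            bool running = true;
            while (running)
            {
                total += Step;
                // The game "believes" exactly 16.66 ms passed since the last call.
                running = update(new GameTime(total, Step));
                // No vsync, no sleep: loop again immediately, as fast as possible.
            }
        }
    }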

Take a look:


Note: the first two minutes show the normal execution of 6 test-cases, and the rest of the video shows the same tests executed with the “unlocked” game-loop.

The approach has some down-sides (as with every route in software development), e.g.: the game may use the system time for certain features (although I think this should be avoided, since the game-loop provides an elapsed total time of the game execution, handled the same way as the elapsed/accumulated time), and another one is full-screen/windowed mode toggling, which is not supported by this feature at all (maybe in the future; I guess it could be done, it would just require some hacks, I don’t know yet). For these problems I “cleverly” introduced a per-test-case setting to override the unlocked game-loop configuration, so the execution speed-up can be disabled for “unstable” test-cases:

    
    <!-- Note: element names are approximate reconstructions. -->
    <TestSuite>
        <UnlockedGameLoop>true</UnlockedGameLoop>
        <TestCases>
            <TestCase>
                <Name>Toggle Fullscreen</Name>
                <Replay>Tests\ReplayToggleFullscreen.xml</Replay>
                <UnlockedGameLoop>false</UnlockedGameLoop>
            </TestCase>
        </TestCases>
    </TestSuite>

Another “I’m not so happy about it” thing is that it is a bit hackish and fully platform-dependent solution currently, but I guess in time I will solve this problem 🙂 .

As I mentioned, there was another route I could have taken to speed up test execution. I think it is a somewhat superior solution, but it would have taken much more effort both software- and hardware-wise, so I decided to go with the simple one. NUnit has an open parallel execution engine add-on, and I think it requires no explanation why that route is superior, since the limiting factor would only be the number of machines I could harness, but the setup (and stability?!) would be a much more complex issue. In time I may try it out, since I’m interested in the actual setup time it takes + I’m certain that with a couple of boxes the execution time would match the time it takes to run a unit test set 🙂 , but the current solution fully satisfies my needs and my work-flow.

The testing framework is in an extremely stable and usable, production-ready state by now. I’m going to make good use of it for my current game too. In time I’m planning to add more features to it, so a “testing – part 4” entry may happen 🙂 , but not anytime soon + most probably I will only focus on smaller usability enhancements and additions.

I’m still working on some framework-y code from time to time (maybe the next entry will be similar, mostly technical) and the current game project is not yet ready for announcement, but with this game I’m going to do more open development. So, starting from the working prototype up to the finished product, I’m going to post (weekly maybe?) releases with limited content, to get feedback and improve usability, balance and overall features from the first days, and to reach more players interested in the game, even before going to Greenlight/itch.io/hopefully-Steam etc…
Expect a somewhat playable version soon (I think within weeks).

Take care!

Magic Item Tech, testing – part 2.

Hi everyone!

It’s been a while since my last post. Lately I’ve had to focus on a lot of things (KREEP, Steam, paperwork, framework development, day-job, health issues, new cat 🙂 !!!), and I wound up not being able to actually focus on any of them…
Now I’m trying to catch my breath, doing the best I can to organize my time better, and I’m going to restore the habit of telling the story of my progress in game-development land (getting back to posting every week or two), starting with a long-ago-promised follow-up on the game software testing topic 😉 !

In the last post I summarized the design goals of the (halfway-)developed framework. The main purpose was to create an automatic testing system which provides a trivial way to create and extend high-level (integration/functional) test cases for my game projects. Since then, I have finalized most of the features and inner workings, and made a regression set for KREEP with pretty decent code and functionality coverage.

The testing system and work-flow is based on “capture and replay”. This means that for creating a test case you actually play the game, trying out various facets of the program, and the engine captures events (e.g.: input events, key and game-pad presses, etc…) while doing so. These events are then saved to a file, and the engine itself can run in “replay” mode to replay these events later on. This alone would not allow things to be validated in an automatic fashion, but a special event type serves the role of replay-based checks. These “Check” events implement the same interface as other common events, so replaying them can be done right before a given “Update” call, and they define specific assertions for game objects in the running game. Since components have a string identifier tag, a check can search for given entities (like the player, a map element, or any enemy monster etc…), and with a little reflection magic, assert on any of the properties of these components. Filling the replay files with these checks, to be run before given “Update” calls, creates the actual validating automatic test cases.
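To make the reflection part less abstract, here is a hedged C# sketch of what such a property check can look like (class and member names are illustrative, not the framework’s actual interface):

    using System;

    public sealed class PropertyCheckEvent
    {
        public string TargetTag { get; set; }     // e.g.: "Player One"
        public string PropertyName { get; set; }  // e.g.: "IsAlive"
        public object ExpectedValue { get; set; }

        // findByTag stands in for the game-object search the framework provides.
        public void Replay(object root, Func<object, string, object> findByTag)
        {
            object target = findByTag(root, TargetTag);
            if (target == null)
                throw new InvalidOperationException(
                    $"No game-object tagged '{TargetTag}' was found.");

            var property = target.GetType().GetProperty(PropertyName);
            if (property == null)
                throw new InvalidOperationException(
                    $"'{TargetTag}' has no property named '{PropertyName}'.");

            object actual = property.GetValue(target, null);
            if (!Equals(actual, ExpectedValue))
                throw new InvalidOperationException(
                    $"Check failed: {TargetTag}.{PropertyName} is {actual}, expected {ExpectedValue}.");
        }
    }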

Here is the (simplified) class diagram showing the architecture. It’s clearly visible that the record & replay systems and their building blocks are mirrored (as is their functionality/goal), and it is easy to extend both systems by introducing new leaf implementations (recorders and events):
[Class diagram: mirrored record & replay systems]

I’m already experimenting with other “Check” event types. The screen-shot compare check compares the state of the back buffer to a given reference image. This approach has great advantages (e.g.: sanity checks for rendering/graphics code + it validates a huge amount of the functionality leading to a given state of the game), but disadvantages too, since it is quite unstable (changing a sprite or a model a little can cause the comparison to fail, though smart comparison algorithms, like histogram or color-channel-distance based comparisons, can help) + such checks are not really helpful until the game is (or at least larger parts of it are) in a quasi-finished state. This is why I haven’t based the validation aspect around this approach, and why it is still not a fully fleshed-out part of the test framework. Game-object hash value checks will be a somewhat similar beast. They are just like the standard property checks, but instead of asserting on scalar values/properties, the hash-code of a game-object (Object.GetHashCode) is recorded and checked when replaying. This is also quite fragile, because adding a new component or a new property to a game-object could break a test, so it is a type of check which is more useful when larger parts of the code approach the finished status, but it can validate a huge part of the game state too! At least it is not hard to update broken but not actually failing tests with new hash values and screen-shots…
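The hash variant is even simpler; a tiny sketch with the same illustrative naming (and with the assumption that the game-objects override GetHashCode to reflect their state):

    using System;

    public sealed class HashCheckEvent
    {
        public string TargetTag { get; set; }
        public int RecordedHash { get; set; } // captured when the replay was recorded

        public void Replay(object target)
        {
            // Meaningful only if the game-object's GetHashCode covers its state.
            int actual = target.GetHashCode();
            if (actual != RecordedHash)
                throw new InvalidOperationException(
                    $"Hash check failed for '{TargetTag}': {actual} != {RecordedHash}.");
        }
    }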

For achieving deterministic playback (at least supporting it in the lower-level systems), the events are captured and replayed on a specific “step” of the game-loop instead of using timestamps, so a space-bar press polled on the 15th call of the “Update” function is played back right before the 15th “Update” call. For this to work as intended a “fixed delta time” game-loop is ~required, but it is not a hard-coded limitation, since both the record and replay systems support extensions (as seen on the UML diagram), and optionally a delta time can be saved for each step and replayed again as the delta time since the last “Update” call (voilà, deterministic replay for “variable delta time” game-loops). Another aid for reliably testing stochastic parts of the code is seed events, usable to capture the seed of a random number generator and to reset a given generator to the recorded seed when replaying, right before the set game-loop step. Later on, if a game or some parts of it become non-deterministic, I hope that the events, being a higher-level abstraction not tied specifically to input devices and input events, could be used for replaying non-deterministic game sessions with special game events instead of captured input (e.g.: disabling a non-deterministic physics system upon replay and relying on “PhysicsDiagnosticEvent” instances).
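A seed event itself can stay tiny; a sketch with illustrative names again:

    using System;

    public sealed class SeedEvent
    {
        public int Step { get; set; } // replayed right before this "Update" call
        public int Seed { get; set; } // captured from the generator at record time

        public Random Replay()
        {
            // The game swaps its random number generator for this re-seeded one,
            // so the stochastic code produces the same sequence as when recorded.
            return new Random(Seed);
        }
    }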

As I mentioned, the events are serialized to a file for later replay. I chose XML (but it could be anything similar), since I already have a lot of code, helpers and tools built for working with this format + I find it quite comfortable (+ it is a natural choice for .NET). Here is a simple replay file containing only a few key-press events:

    <!-- Note: element names and values are approximate reconstructions. -->
    <Replay>
        <KeyPressEvent Frame="10" Key="Up" />
        <KeyPressEvent Frame="25" Key="Down" />
        <KeyPressEvent Frame="40" Key="Enter" />
    </Replay>

To be able to better organize test cases (extracting common parts), and to aid creating test cases by hand instead of capturing game-play footage (really useful for UI/menu tests), I’ve implemented an “include” attribute for the “EventStrip” type, so the contents of a replay can be defined in multiple files. Event strips are actually specific event implementations containing a list of “relative” events, which can be replayed/started at a given frame relative to the starting frame of the strip itself. This way multiple events can be replayed in “parallel”, and it is easy to capture multiple separate pieces of event footage and play them back combined, simultaneously:

    <!-- Note: element names and file paths are approximate reconstructions. -->
    <Replay>
        <EventStrip StartFrame="0" Include="Tests\MenuNavigation.xml" />
        <EventStrip StartFrame="100">
            <KeyPressEvent Frame="0" Key="Space" />
        </EventStrip>
    </Replay>
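In code, a strip is simply a composite event. A compact sketch of the concept (member names are illustrative, not the actual framework types):

    using System.Collections.Generic;

    public class DiagnosticEvent
    {
        public int Frame { get; set; } // relative to the containing strip

        public virtual void Replay(int absoluteFrame) { }
    }

    public sealed class EventStrip : DiagnosticEvent
    {
        public List<DiagnosticEvent> Events { get; } = new List<DiagnosticEvent>();

        public override void Replay(int absoluteFrame)
        {
            // A child fires when the global frame reaches strip start + child offset,
            // so separately captured strips can run in parallel within one replay.
            foreach (var child in Events)
                if (Frame + child.Frame == absoluteFrame)
                    child.Replay(absoluteFrame);
        }
    }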

To be as compact as possible, memory-, disk-space- and mental-health-wise :D, the basic building block, the “DiagnosticEvent” class, is not defined and implemented as a “once-only” event like in most event architectures. It has a duration, and any concrete event implementing its interface can decide to span over and be played for multiple “Update” calls. The most common example is a key-press. There are multiple ways to capture and replay a user pressing a key and later on releasing it. Here are the most common approaches, with the cons against them:

  1. Save the keys in a pressed state every single frame as distinct events. This takes an awful lot of memory and disk-space, and it is close to impossible to edit by hand…
  2. Save two events for each press: an event for pressing and an event for releasing. This is a much, much better approach than the first one, but I still hated its concept, since any time you would like to edit an actual key-press, for example to make it happen a couple of frames earlier, you have to modify two events, and you have to make sure they align well, the frame numbers are correct, the release event is not missing etc…, or you may accidentally end up with a replay which presses a key and never releases it.

The third approach, which I used, and I think is the most feasible solution, is one event which can define how many frames it spans over. As an example, a player presses fire (e.g.: the left mouse button) and holds it down for 30 frames. That is one event that should be replayed for 30 frames from its defined relative starting frame. This way it is easy to make a press longer or shorter. Also, to move a button press around within a test-case, e.g.: to make it happen earlier or later, only one number has to be modified 😉 !

    <!-- Note: element names and values are approximate reconstructions. -->
    <Replay>
        <KeyPressEvent Frame="15" Key="Space" Duration="30" />
        <KeyPressEvent Frame="60" Key="Enter" Duration="1" />
    </Replay>
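The backing event implementation can stay trivial this way too; a tiny sketch with illustrative names:

    public sealed class KeyPressEvent
    {
        public int StartFrame { get; set; } // relative starting frame of the press
        public int Duration { get; set; }   // number of "Update" calls the key is held

        public bool IsDown(int frame)
        {
            // One entry covers the whole press: start, hold, release.
            return frame >= StartFrame && frame < StartFrame + Duration;
        }
    }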

Here is the last XML example, a simple check used in the KREEP test suite, requiring that the first player (Red) is alive. The game-object for this player is tagged “Player One”, the players are contained within a game-object composition tagged “Players”, and the root component of the game is the “ScreenManager”, which doesn’t need more explanation 🙂 .

    <!-- Note: element names and the frame number are approximate
         reconstructions; the tags and the property match the description. -->
    <Replay>
        <CheckEvent Frame="250">
            <GameObject Tag="ScreenManager">
                <GameObject Tag="Players">
                    <GameObject Tag="Player One">
                        <Property Name="IsAlive" Expected="true" />
                    </GameObject>
                </GameObject>
            </GameObject>
        </CheckEvent>
    </Replay>

If this check is included for a given frame, and while replaying, on that frame, the value of the “IsAlive” boolean property of the game-object is false, or the game-object is not found, an exception is generated. That is how I validate things, and hopefully discover early on if I mess stuff up with my modifications.

The last big magic trick I added to this whole mix is a test-case runner system. I’m a big “fan” of one-click stuff (who isn’t 😛 😉 ?). I’ve looked around for how to do this, and since I’ve been using NUnit for a while now, it was my first choice. Thankfully NUnit has both a command-line and a GUI based test-execution application, proper result reporting, and a built-in way to programmatically generate test cases at runtime! So I’ve built a simple ~application thingy which generates test cases for the NUnit harness from replay files and some meta data (again in an XML file 😀 ). When these tests are executed by NUnit, the glue app simply launches a pre-defined application linking my engine, e.g.: KREEP, starting it in “replay” mode and feeding it the path of the replay XML file to be loaded and run (achieved with a huge amount of reflection magic and some extra inter-process naughtiness 😀 ). If no exception occurs during the replay, it shuts down the game, nice and clean, then advances; otherwise the unhandled exception propagates to the application domain border, and the glue app fetches it with some inter-process serialization magic (again) to make NUnit know about the failure situation and cause. All in all the glue app is pretty small, has no dependencies at all besides NUnit, and it utilizes some tricks (a.k.a. hacks 😛 ), but nothing out of the ordinary (actually pretty “common” stuff for core .NET programmers), and as the last and best addition, it will work out of the box, without any modifications, for any game project built upon my framework (no special implementation/preparation is required from the game application either!).
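Stripped of the reflection and inter-process magic, the shape of the glue is roughly the following (a hedged sketch: the executable name, the --replay flag and the folder layout are stand-ins, and the real app surfaces the original exception instead of just an exit code):

    using System.Collections.Generic;
    using System.Diagnostics;
    using System.IO;
    using NUnit.Framework;

    [TestFixture]
    public class ReplayTests
    {
        public static IEnumerable<string> ReplayFiles()
        {
            // One generated NUnit test case per replay file.
            return Directory.EnumerateFiles("Tests", "Replay*.xml");
        }

        [TestCaseSource(nameof(ReplayFiles))]
        public void Replay(string replayPath)
        {
            // Launch the game in replay mode and let it run the whole recording.
            var process = Process.Start("KREEP.exe", $"--replay \"{replayPath}\"");
            process.WaitForExit();
            // Any unhandled exception during the replay ends in a non-zero exit.
            Assert.AreEqual(0, process.ExitCode, $"Replay failed: {replayPath}");
        }
    }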

I recorded a little footage with Bandicam to show how this looks in action. In the first part of the video I execute three selected test-cases, all passing. Then I edit the third case to add a deliberate failure. This modified case checks the “Energy shield” mutator. It expects that when a match starts with this mutator set, all players have an active shield, and a laser shot hitting the second player will not score a kill, but the shield will be disabled right afterwards. The expected boolean property (ShieldActive) is changed to “true”, which is obviously wrong, as the shield does wear off right after the shot, and the test runner signals the failed assertion:

This way I just have to press a button, and within a couple of minutes I know whether a new/modified version of KREEP is ready to be released or not.

Lessons learned, conclusions, plans for the future:
It exceeded my expectations. I know it’s my brain-child and stuff :D, so who else would be pleased if not me, but I do believe it is going to make my life much easier with game releases and patches in the future, and it will probably help a lot mid-production too. It took approximately two work days of recording and creating test-cases to reach 80% code coverage on the KREEP code base. This framework is a pretty decent result and I’m happy to have made it 🙂 ! There are also a lot of handy utility features already built in, since I upgraded some parts while I was using it to make it more comfy, but this post is already too enormous to talk about all that stuff 😀 …
A “limitation” which I’m going to fix for my next project is the time it takes to run a full test. It is not yet unbearable or anything (it takes approximately 5 minutes for KREEP, so a coffee and/or a cup of tea 🙂 ), but for a project with a lengthy single-player campaign it could take “too” long, and parallel test-case execution (which NUnit supports) would not help much (though with save-games it could be helped). A simple antidote, which I’m already working on, is a special game-loop which “fakes” a fixed 60-times-per-second update rate, passing down 16.66 elapsed milliseconds to the game-objects, but actually steps the simulation as fast as possible (poor CPU 😀 😛 ), achieving a fast-forward speed-up mode, so to speak.

This post became pretty lengthy and heavily technical, but I wanted to share my latest big achievement in detail (yes, I love tech-talk…).
Meanwhile the work on the Steam release of KREEP is ongoing. It is going much slower than I expected, so the early March release is currently in danger, but I’m doing the best I can. Not all is lost yet. The paperwork is done, I’m a Steamworks partner, it’s official :), and I’m working on integrating the Steam API. I’m also working hard on adding extra content for the release (achievements yeah!!!). I hope it’s going to be cool.

Next time I’ll do a more detailed status report on KREEP+Steam…
Stay tuned!

Magic Item Tech, testing – part 1.

Hello all!

I haven’t written for a long time, but I’ve been busy working on my “tech”, mostly preparing for my upcoming project. When I started to think through what I would like to write about, I realized that it is going to be a rather long one, hence the “part 1” in the title; I’m planning to continue this topic in a week or two.

Now onto some tech talk, but before I go on, I have to tell you that I’m a BIIIIIG software testing advocate, and this topic will mostly be about that. If you think that stuff is gibberish, do not continue 😉 !
So, I’ve been working in the last few weeks on improving my testing work-flow and the tech supporting it. As I’ve probably mentioned before, I have a small but really STABLE code-base, which I nicknamed “Magic Item” (let’s call it the framework from now on). I use this framework to build games. It is based on XNA, but mostly uses it for rendering, sound and input handling, and provides somewhat higher-level services, like a game-object framework, animated sprites, collision handling etc., that are common to all games regardless of their type.
I’ve emphasized stable for a good reason: every single function that gets into this framework is unit tested and documented. I’m confident that it is close to impossible to develop software (especially big software) in the long run without a proper automated regression suite and sufficient documentation. The way to achieve it can be argued over and is somewhat a personal preference (some like unit tests, some do not and prefer more focus on module/functional tests; some like API docs, some do not and prefer more of a feature-based high-level documentation not tied to the code), but I hope many agree that it is a must (yes, I’m being naive, I know).

[Screenshot: NUnit runner with all tests passing]
I love the look of green test results in the morning :D.

[Screenshot: OpenCover code coverage results]
This means I’m pretty thorough :P.

I’ve tried to create a lot of games before, many of which failed due to being too-ambitious ideas or me not being persistent enough, but usually I could at least salvage some code from these projects and build it into this little framework of mine. I’ve been developing the framework this way for years now, and every time I stumbled upon a feature which could be useful for many games, I properly integrated it by designing, implementing, testing and documenting the code. It is a small library, since this has always been more of a hobby endeavor, but due to its level of polish, working on it, or with it to create a game, cheers me up!

[Operation KREEP banner]

Then came Operation KREEP. This was my second completed game project, and I realized something really important during its development. I had to write a lot of code specific only to this game, and, I think this is pretty shameful, I had no proper regression suite to back it up. In the last weeks of development I was doing hours of manual testing just to make sure I did not break anything accidentally. I considered this a failure from a developer’s perspective, since I knew perfectly well what I was doing and still did not prepare, wasting a lot of time. Though I also thought that unit testing only a small part of the high-level code in KREEP was not such a bad idea, since it is not the type of testing method that can cover a lot of functionality with a small time investment. So in the meantime I realized that I have to find a cheap/smart way of testing the actual games I make, in an automatic fashion.

I’ve decided that unit testing works perfectly for the framework code, but that I have to test the game projects at a much higher level. My other requirements were that it has to be stable (as deterministic as possible), simple to automate, and that it has to be really easy to create or extend test cases which cover a lot of ground. Yep, no worries, this is a trivial task 😀 !
I’ve been working on this testing method for the last three to four weeks, and I believe I’ve arrived at a really good compromise. I’m not going to go into too much detail in this post (I want to leave some stuff to talk about for next time 🙂 ), but here is the overall design:
The testing system and work-flow is based on “capture and replay”. The framework provides an extensible system to capture events while you are playing and a mirror construct for replaying them (e.g.: input device events, key presses, mouse moves etc…, but the client can define its own event types and capture+replay mechanisms). Besides replaying input events, the replay files themselves can be extended and filled with various checks to be done at a certain time or frame, and with some reflection magic even tiny details can be checked in the game-world. This way you can capture game-play footage as you are playing the game, it can be replayed any time later, and it is easy to create test cases and to build a regression suite out of your recordings by adding various asserts and checks at certain points in them. I did my homework: I know all the positives and shortcomings of “capture and replay” based testing, and I worked my ass off to come up with a good solution for most of the problems, or at least to make the system work for me instead of against me.

Most of the implementation is done. I’ve already hooked it into NUnit, so replay-based test cases can be executed with the NUnit runner (I use NUnit for unit testing too, so it was a natural choice), and the whole concept seems to work surprisingly well! I’m really proud of the result :). Testing the final build of my next game will be a breeze :).

In my next post (probably sometime around next week) I’m going to talk about the details of my implementation and how I approached the design of the system to achieve my requirements.
Until then, I wish you and your family a merry Christmas, and if I happen to be too busy (or lazy 😀 ) during the holidays and postpone the next post, a happy new year too!